it.jrc.entitymatcher
Class Entity
java.lang.Object
it.jrc.entitymatcher.Entity
public class Entity
- extends java.lang.Object
Constructor Summary
Entity()
Entity(char[] data, int index, java.lang.String word, int cIndex)
Method Summary
void appendTo(java.lang.StringBuilder sb)
char[] charCode()
Returns the entity metadata encoded in a char[4] array:
2 characters for the integer ID (32 bits)
1 character for the type
1 character for the index in the code2lan array
static char[] charID(int id)
void init(java.lang.String line)
Parses an entity from a line.
The format of the line should be (whitespace separated):
id type language alias (note that in this implementation any - signs are ignored...
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
id
public int id
type
public char type
lan
public java.lang.String lan
alias
public java.lang.String alias
languageCode
public int languageCode
charIndex
public int charIndex
Entity
public Entity()
Entity
public Entity(char[] data,
int index,
java.lang.String word,
int cIndex)
charID
public static char[] charID(int id)
init
public void init(java.lang.String line)
- Parses an entity from a line.
The format of the line should be (whitespace separated):
id type language alias (note that in this implementation any - signs are ignored... )
id: the numeric id of the entity (Integer)
type: one character, set to lowercase; normally p (person) or o (other)
language: 'u' for all languages, or a 2-character ISO language code (very occasionally 3 characters, e.g. 'pap'); set to lowercase
alias: a string representing the entity (possibly restricted by language code)
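The line format described above can be illustrated with a minimal sketch. This is not the actual body of init: the class EntityLineSketch, the holder ParsedEntity, and the method parse are hypothetical names, and the sketch assumes that the alias is everything after the third field (so it may contain spaces). The handling of - signs mentioned above is not reproduced, because the documentation does not spell it out.

```java
import java.util.Locale;

public class EntityLineSketch {

    // Hypothetical holder mirroring some of the public fields documented for Entity.
    static final class ParsedEntity {
        int id;
        char type;
        String lan;
        String alias;
    }

    // Parses "id type language alias" (whitespace separated).
    static ParsedEntity parse(String line) {
        // Assumption: split into at most four fields so the alias keeps internal spaces.
        String[] parts = line.trim().split("\\s+", 4);
        if (parts.length < 4) {
            throw new IllegalArgumentException("expected: id type language alias");
        }
        ParsedEntity e = new ParsedEntity();
        e.id = Integer.parseInt(parts[0]);                   // numeric id
        e.type = Character.toLowerCase(parts[1].charAt(0));  // one character, lowercased
        e.lan = parts[2].toLowerCase(Locale.ROOT);           // 'u' or ISO code, lowercased
        e.alias = parts[3];                                  // rest of the line
        return e;
    }

    public static void main(String[] args) {
        ParsedEntity e = parse("42 P EN Example Person");
        // prints: 42 p en Example Person
        System.out.println(e.id + " " + e.type + " " + e.lan + " " + e.alias);
    }
}
```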
charCode
public char[] charCode()
- Returns the entity metadata encoded in a char[4] array:
2 characters for the integer ID (32 bits)
1 character for the type
1 character for the index in the code2lan array
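The char[4] layout above can be sketched as follows. This is one possible encoding, not necessarily the one charCode() or charID(int) uses: the class CharCodeSketch and its methods are hypothetical, and the order of the two ID characters (high 16 bits first) is an assumption, since the documentation only states that the 32-bit ID occupies 2 characters.

```java
public class CharCodeSketch {

    // Splits a 32-bit int into two 16-bit chars.
    // Assumption: high half first, low half second.
    static char[] packId(int id) {
        return new char[] { (char) (id >>> 16), (char) id };
    }

    // Reassembles the int from the two chars produced by packId.
    static int unpackId(char hi, char lo) {
        return (hi << 16) | lo;
    }

    // Builds the 4-char code: 2 chars of ID, 1 char of type,
    // 1 char holding the index into a language-code array.
    static char[] encode(int id, char type, int languageIndex) {
        char[] idChars = packId(id);
        return new char[] { idChars[0], idChars[1], type, (char) languageIndex };
    }

    public static void main(String[] args) {
        char[] code = encode(74565, 'p', 3);
        int id = unpackId(code[0], code[1]);
        // prints: 74565 p 3
        System.out.println(id + " " + code[2] + " " + (int) code[3]);
    }
}
```

Because a Java char is an unsigned 16-bit value, two of them hold any int exactly, including negative IDs.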
appendTo
public void appendTo(java.lang.StringBuilder sb)