FreeLing analyzer is not working with NER, dates, numerals

I am trying to use FreeLing to recognize and classify named entities in Spanish. I am testing with the analyzer because I still don't understand how to use the Python API. When I run the analyzer on a text, it doesn't seem to recognize or classify named entities, dates, or anything of the sort. This is what I do:

analyze -f es.cfg --ner --nec --date < mytext > out

where the content of mytext is:

En ese contexto, el ministro de Salud Pública, José Angel Portal Miranda, reiteró que, con 2 205 casos confirmados y 83 pacientes fallecidos, Cuba continúa bajando la letalidad hasta un 3,76 %, lo cual nos mantiene en el lugar 18 entre los 35 países de las Américas que reportan casos positivos hasta ayer 10 de junio.

and the output is:

En en SP 1    
ese ese DD0MS0 0.966694    
contexto contexto NCMS000 1    
, , Fc 1    
el el DA0MS0 1    
ministro ministro NCMS000 1    
de de SP 0.999961    
Salud_Pública salud_pública NP00V00 1    
, , Fc 1    
José_Angel_Portal_Miranda josé_angel_portal_miranda NP00SP0 1
...
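
To make output like the listing above easier to inspect, here is a minimal Python sketch that splits each line into form, lemma, tag, and probability, and then looks at the tag. It rests on assumptions that are not confirmed anywhere in this question: that the tags follow an EAGLES-style tagset in which characters 5-6 of a proper-noun tag (`NP....`) carry the entity class, that `W` tags mark dates, and that `Z` tags mark numbers. The names `Token`, `parse_line`, `describe`, and `NE_CLASSES` are hypothetical helpers, not part of FreeLing.

```python
from typing import NamedTuple, Optional

# ASSUMPTION: class codes found in characters 5-6 of "NP....." tags.
NE_CLASSES = {
    "SP": "person",
    "G0": "location",
    "O0": "organization",
    "V0": "other",
}


class Token(NamedTuple):
    form: str
    lemma: str
    tag: str
    prob: float


def parse_line(line: str) -> Optional[Token]:
    """Parse one 'form lemma tag prob' line; return None for anything else."""
    fields = line.split()
    if len(fields) < 4:
        return None  # blank lines, "...", and other non-token lines
    # Only the first (lemma, tag, prob) triple is read; further analyses
    # on the same line, if any, are ignored.
    form, lemma, tag, prob = fields[:4]
    try:
        return Token(form, lemma, tag, float(prob))
    except ValueError:
        return None


def describe(token: Token) -> Optional[str]:
    """Return a rough label for entity, date, and number tags (assumed codes)."""
    tag = token.tag
    if tag.startswith("NP") and len(tag) >= 6:
        return "named entity: " + NE_CLASSES.get(tag[4:6], "unknown")
    if tag.startswith("W"):
        return "date/time"
    if tag.startswith("Z"):
        return "number"
    return None


if __name__ == "__main__":
    sample = [
        "ministro ministro NCMS000 1",
        "Salud_Pública salud_pública NP00V00 1",
        "José_Angel_Portal_Miranda josé_angel_portal_miranda NP00SP0 1",
    ]
    for raw in sample:
        tok = parse_line(raw)
        if tok is not None:
            print(tok.form, tok.tag, describe(tok))
```

If those assumptions hold, the class information would be encoded inside the tag column itself rather than printed as a separate field, so it is worth checking the tagset documentation before concluding that no classification happened.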

I don't see the entity classes anywhere, nor any indication that something is a date.

The content of es.cfg is:

##
#### default configuration file for Spanish analyzer
##

#### General options 
Lang=es
Locale=default

### Tagset description file, used by different modules
TagsetFile=$FREELINGSHARE/es/tagset.dat

#### Trace options. Only effective if we have compiled with -DVERBOSE
#
## Possible values for TraceModule (may be OR'ed)
#define SPLIT_TRACE         0x00000001
#define TOKEN_TRACE         0x00000002
#define MACO_TRACE          0x00000004
#define OPTIONS_TRACE       0x00000008
#define NUMBERS_TRACE       0x00000010
#define DATES_TRACE         0x00000020
#define PUNCT_TRACE         0x00000040
#define DICT_TRACE          0x00000080
#define SUFF_TRACE          0x00000100
#define LOCUT_TRACE         0x00000200
#define NP_TRACE            0x00000400
#define PROB_TRACE          0x00000800
#define QUANT_TRACE         0x00001000
#define NEC_TRACE           0x00002000
#define AUTOMAT_TRACE       0x00004000
#define TAGGER_TRACE        0x00008000
#define HMM_TRACE           0x00010000
#define RELAX_TRACE         0x00020000
#define RELAX_TAGGER_TRACE  0x00040000
#define CONST_GRAMMAR_TRACE 0x00080000
#define SENSES_TRACE        0x00100000
#define CHART_TRACE         0x00200000
#define GRAMMAR_TRACE       0x00400000
#define DEP_TRACE           0x00800000
#define UTIL_TRACE          0x01000000

TraceLevel=0
TraceModule=0x0000

## Options to control the applied modules. The input may be partially
## processed, or a full analysis may not be wanted. The specific
## formats are a choice of the main program using the library, as well
## as the responsibility of calling only the required modules.
## Valid input/output formats are: plain, token, splitted, morfo, tagged, parsed
InputLevel=text
OutputLevel=morfo

# consider each newline as a sentence end
AlwaysFlush=no

#### Tokenizer options
TokenizerFile=$FREELINGSHARE/es/tokenizer.dat

#### Splitter options
SplitterFile=$FREELINGSHARE/es/splitter.dat

#### Morfo options
AffixAnalysis=yes
CompoundAnalysis=yes
MultiwordsDetection=yes
NumbersDetection=yes
PunctuationDetection=yes
DatesDetection=yes
QuantitiesDetection=yes
DictionarySearch=yes
ProbabilityAssignment=yes
DecimalPoint=,
ThousandPoint=.
LocutionsFile=$FREELINGSHARE/es/locucions.dat 
QuantitiesFile=$FREELINGSHARE/es/quantities.dat
AffixFile=$FREELINGSHARE/es/afixos.dat
CompoundFile=$FREELINGSHARE/es/compounds.dat
ProbabilityFile=$FREELINGSHARE/es/probabilitats.dat
DictionaryFile=$FREELINGSHARE/es/dicc.src
PunctuationFile=$FREELINGSHARE/common/punct.dat
ProbabilityThreshold=0.001

# NER options 
NERecognition=yes

# config file for "crf" machine learning NERC
# (recognition and classification in a single step)
NPDataFile=$FREELINGSHARE/es/nerc/nerc/nerc.dat

# config file for "basic" rule based NER
#NPDataFile=$FREELINGSHARE/es/np.dat

# config file for "bio" machine learning NER
# NPDataFile=$FREELINGSHARE/es/nerc/ner/ner-ab-poor1.dat
# NPDataFile=$FREELINGSHARE/es/nerc/ner/ner-ab-rich.dat
# "rich" model is trained with rich gazetteer. Offers higher accuracy but 
# requires adapting gazetteer files to have high coverage on target corpus.
# "poor1" model is trained with poor gazetteer. Accuracy is slightly lower
# but suffers only a small accuracy loss when the gazetteer has low coverage
# in target corpus. If in doubt, use "poor1" model.

## Phonetic encoding of words.
Phonetics=no
PhoneticsFile=$FREELINGSHARE/es/phonetics.dat

## NEC options. See README in common/nec
NEClassification=yes
NECFile=$FREELINGSHARE/es/nerc/nec/nec-ab-poor1.dat
#NECFile=$FREELINGSHARE/es/nerc/nec/nec-ab-rich.dat

## Sense annotation options (none,all,mfs,ukb)
SenseAnnotation=none
SenseConfigFile=$FREELINGSHARE/es/senses.dat
UKBConfigFile=$FREELINGSHARE/es/ukb.dat

#### Tagger options
Tagger=hmm
TaggerHMMFile=$FREELINGSHARE/es/tagger.dat
TaggerRelaxFile=$FREELINGSHARE/es/constr_gram-B.dat
TaggerRelaxMaxIter=500
TaggerRelaxScaleFactor=670.0
TaggerRelaxEpsilon=0.001
TaggerRetokenize=yes
TaggerForceSelect=tagger

#### Parser options
GrammarFile=$FREELINGSHARE/es/chunker/grammar-chunk.dat

#### Dependence Parser options
DependencyParser=lstm
DepLSTMFile=$FREELINGSHARE/es/dep_lstm/params-es.dat
#DependencyParser=txala
DepTxalaFile=$FREELINGSHARE/es/dep_txala/dependences.dat
#DependencyParser=treeler
DepTreelerFile=$FREELINGSHARE/es/treeler/dependences.dat

# Semantic Role Labelling options
SRLTreelerFile=$FREELINGSHARE/es/treeler/srl.dat

#### Coreference Solver options
#CorefFile=$FREELINGSHARE/es/coref/relaxcor_constit/relaxcor.dat
CorefFile=$FREELINGSHARE/es/coref/relaxcor_dep/relaxcor.dat
SemGraphExtractorFile=$FREELINGSHARE/es/semgraph/semgraph-SRL.dat
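
Because the file is long, it can help to print only the options that plausibly bear on entity and date handling. This is a minimal sketch, assuming the simple `Key=value` layout with `#` comment lines seen above; `read_options` and `SAMPLE` are hypothetical and not part of FreeLing, and the sketch only reports values without claiming how the analyzer interprets them.

```python
from typing import Dict


def read_options(text: str) -> Dict[str, str]:
    """Collect Key=value pairs, skipping blank lines and '#' comment lines."""
    options: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # keep only lines that actually contain '='
            options[key.strip()] = value.strip()
    return options


# A few lines copied from the es.cfg shown above.
SAMPLE = """\
InputLevel=text
OutputLevel=morfo
DatesDetection=yes
# NER options
NERecognition=yes
#NPDataFile=$FREELINGSHARE/es/np.dat
NEClassification=yes
"""

opts = read_options(SAMPLE)
for name in ("InputLevel", "OutputLevel", "DatesDetection",
             "NERecognition", "NEClassification"):
    print(name, "=", opts.get(name, "<not set>"))
```

To run it on the real file, the text could be loaded with `open("es.cfg", encoding="utf-8").read()` and passed to `read_options` in place of `SAMPLE`.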

please help :)

python
nlp
ner
freeling
asked on Stack Overflow Jun 11, 2020 by Udl David


