The NLP Interchange Format (NIF) is an RDF/OWL-based format that aims to achieve interoperability between Natural Language Processing (NLP) tools, language resources and annotations. NIF consists of specifications, ontologies and software (overview), which are combined under the version identifier "NIF 2.0", but are versioned individually.
This specification complements the NIF 2.0 Core specification by specifying in detail how the interface of a NIF implementation must behave. The focus is on how to access the tools and web services, not on what the transferred data contains. We distinguish between the Web Service (NIF-WS) and the Command Line Interface (NIF-CLI), which are collectively called NIF implementations.
Note that only the parameter input is required during a request. Where this document calls a parameter "required", it means that your NIF implementation must implement the parameter in order to be interoperable with other NIF implementations and clients.
Overall this specification contains:
- input (i): see informat and intype
  - NIF-CLI MUST read from stdin if input is "-" ("--input -" or "-i -")
- informat (f):
  - turtle (default)
  - text
  - json-ld is scheduled to be included, if enough implementations exist.
- intype (t):
  - direct (default)
  - url
  - file (only CLI)
- outformat (o):
  - turtle (default)
  - text
  - json-ld is scheduled to be included, if enough implementations exist.
- urischeme (u):
  - RFC5147String (default)
  - CStringInst
- prefix (p): free parameter; SHOULD implement a sensible default
- info or info=true or help SHOULD say which parameters are implemented.
- ntriples, rdfxml for the parameters informat and outformat, if this does not require a lot of extra work (e.g. built into the RDF library used).
- [...] input data is given.
- [...] port (default: 8899) and then listen on this port and accept all NIF-WS parameters.
- If intype is direct, then NIF-CLI MUST either:
  - take the data directly from the value of input: --input "My favourite actress is Natalie Portman" -f text
  - read from stdin if input is just a "-" sign: --input - or -i -
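The parameters and defaults listed above could be declared in a command line tool roughly as follows. This is a minimal sketch using Python's argparse, not a reference implementation: the function name build_parser, the program name and the default prefix value are assumptions made for illustration; only the parameter names, short forms and defaults are taken from this specification.

```python
import argparse


def build_parser():
    """Declare the core NIF parameters with the defaults named in the spec."""
    parser = argparse.ArgumentParser(prog="nif-cli")  # program name is hypothetical
    parser.add_argument("-i", "--input", required=True,
                        help='data to process, or "-" to read from stdin')
    parser.add_argument("-f", "--informat", default="turtle",
                        choices=["turtle", "text"])
    parser.add_argument("-t", "--intype", default="direct",
                        choices=["direct", "url", "file"])
    parser.add_argument("-o", "--outformat", default="turtle",
                        choices=["turtle", "text"])
    parser.add_argument("-u", "--urischeme", default="RFC5147String",
                        choices=["RFC5147String", "CStringInst"])
    # The spec only asks for "a sensible default"; this value is an assumption.
    parser.add_argument("-p", "--prefix", default="http://example.org/nif#")
    return parser


# argparse accepts a lone "-" as an option value, so "-i -" parses as expected.
args = build_parser().parse_args(["-i", "-", "-f", "text"])
print(args.input, args.informat, args.intype, args.outformat, args.urischeme)
```

Declaring the defaults in one place keeps the CLI and any web service wrapper around it consistent.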
Parameter | Description |
---|---|
input (i)
|
Input: This is the serialized data (i.e. the text or the NIF RDF in Turtle or another format). Since the value of this parameter contains the transferred data that the tool has to process, additional parameters are required to describe the data. Input Type (see below) specifies how the data is retrieved (e.g. direct, url or via file).
Input Format (see below) specifies the format of the retrieved data (e.g. text, turtle or json-ld).
NIF-CLI: If intype is direct and input is "-", then NIF-CLI MUST read from stdin:
    echo -n "My favourite actress is Natalie Portman." | java -jar nif-cli.jar --informat text -i -
|
informat (f)
|
Input Format: Determines in which format the input is given. Required values are:
- turtle (default)
- text
json-ld is scheduled to be included, if enough implementations exist. Furthermore, these optional values MAY be implemented:
- ntriples
- rdfxml
|
intype (t)
|
Input Type: Determines how input is accessed or retrieved. Values are:
- direct (default)
- url
- file (only CLI)
    cat textfile.txt | java -jar nif-cli.jar --informat text --input -
|
outformat (o)
|
Output Format: The format in which the output is serialized.
- turtle (default)
- text
json-ld is scheduled to be included, if enough implementations exist. Furthermore, these optional values MAY be implemented:
- ntriples
- rdfxml
|
urischeme (u)
|
URI Scheme: The URI Scheme the NIF implementation must use.
- RFC5147String (default)
- CStringInst
|
prefix (p)
|
Prefix: The prefix which the NIF implementation MUST use to create and parse URIs.
Examples: input is "My favourite actress is Natalie Portman."; the three results below correspond to the prefixes http://example.org#, http://example.org/whatever/ and http://example.org/nif? respectively.
    <http://example.org#char=0,40>
        rdf:type nif:RFC5147String , nif:Context ;
        nif:beginIndex "0" ;
        nif:endIndex "40" ;
        nif:isString "My favourite actress is Natalie Portman." .

    <http://example.org/whatever/char=0,40>
        rdf:type nif:RFC5147String , nif:Context ;
        nif:beginIndex "0" ;
        nif:endIndex "40" ;
        nif:isString "My favourite actress is Natalie Portman." .

    <http://example.org/nif?char=0,40>
        rdf:type nif:RFC5147String , nif:Context ;
        nif:beginIndex "0" ;
        nif:endIndex "40" ;
        nif:isString "My favourite actress is Natalie Portman." .
|
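The way the prefix and the RFC5147String scheme combine into the context URI can be sketched in a few lines of Python. The helper name mint_context_uri is hypothetical, and the sketch assumes that offsets are counted in Unicode code points, which is how Python's len works; the Core specification is the authority on offset counting.

```python
def mint_context_uri(prefix, text):
    """Build the URI of the whole-text context: prefix + "char=0,<length>"."""
    return f"{prefix}char=0,{len(text)}"


text = "My favourite actress is Natalie Portman."
for prefix in ("http://example.org#",
               "http://example.org/whatever/",
               "http://example.org/nif?"):
    print(mint_context_uri(prefix, text))
```

Because the prefix is simply prepended, the client decides whether the result is a hash, slash or query style URI by choosing the last character of the prefix.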
    curl "http://nlp2rdf.lod2.eu/nif-ws.php?input=My%20favourite%20actress%20is%20Natalie%20Portman.&informat=text"
    # or
    curl --data-urlencode input="My favourite actress is Natalie Portman." -d informat=text "http://nlp2rdf.lod2.eu/nif-ws.php"
    # using Accept:
    curl --data-urlencode input="My favourite actress is Natalie Portman." -H "Accept: text/plain" "http://nlp2rdf.lod2.eu/nif-ws.php"

    curl -X POST --data-urlencode input="My favourite actress is Natalie Portman." -d informat=text "http://nlp2rdf.lod2.eu/nif-ws.php"
    curl -X POST --data-urlencode input="My favourite actress is Natalie Portman." -H "Accept: text/plain" "http://nlp2rdf.lod2.eu/nif-ws.php"
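For clients that build requests programmatically, the percent-encoded GET URL from the curl examples can be produced with Python's standard library. This is only a sketch: build_get_url is a hypothetical helper name, and no request is actually sent.

```python
from urllib.parse import quote, urlencode


def build_get_url(endpoint, **params):
    """Append percent-encoded NIF parameters to the endpoint as a query string."""
    # quote_via=quote encodes spaces as %20 rather than "+".
    return endpoint + "?" + urlencode(params, quote_via=quote)


url = build_get_url(
    "http://nlp2rdf.lod2.eu/nif-ws.php",
    input="My favourite actress is Natalie Portman.",
    informat="text",
)
print(url)
```

For longer inputs a POST body is usually the safer choice, since URL length limits vary between servers and proxies.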
prefix parameter:
    curl --data-urlencode input="My favourite actress is Natalie Portman." -d informat=text \
        --data-urlencode prefix="http://example.org/nif#" "http://nlp2rdf.lod2.eu/nif-ws.php"
    curl --data-urlencode input="My favourite actress is Natalie Portman." -d informat=text \
        --data-urlencode prefix="http://example.org/nif/" "http://nlp2rdf.lod2.eu/nif-ws.php"
    curl --data-urlencode input="My favourite actress is Natalie Portman." -d informat=text \
        --data-urlencode prefix="http://example.org/nif?" "http://nlp2rdf.lod2.eu/nif-ws.php"
    # using md5("My favourite actress is Natalie Portman.") = ae0aaa2ad528f072356827042afc6011 as prefix
    curl --data-urlencode input="My favourite actress is Natalie Portman." -d informat=text \
        --data-urlencode prefix="http://example.org/ae0aaa2ad528f072356827042afc6011#" "http://nlp2rdf.lod2.eu/nif-ws.php"
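A content-derived prefix like the md5-based one above could be computed as sketched here. The helper name md5_prefix and the choice of UTF-8 as the encoding to hash are assumptions; the specification does not prescribe how a client derives its prefix.

```python
import hashlib


def md5_prefix(base, text):
    """Return base + md5 hex digest of the text + "#" as a NIF prefix."""
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()  # UTF-8 is an assumption
    return f"{base}{digest}#"


prefix = md5_prefix("http://example.org/", "My favourite actress is Natalie Portman.")
print(prefix)
```

Hashing the input gives each distinct text its own namespace, so URIs minted for different documents do not collide.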
    # -f or --informat specifies the format (text, turtle, rdfxml)
    # -t or --intype specifies the input type (direct, file, url)
    echo -n "My favourite actress is Natalie Portman." | java -jar stanfordNIF-beta.jar -f text -t direct -i -
    # -t can be omitted since direct is the default
    echo -n "My favourite actress is Natalie Portman." | java -jar stanfordNIF-beta.jar -f text -i -

    # -t or --intype specifies the input type (direct, file, url)
    # -i or --input specifies the input
    echo "My favourite actress is Natalie Portman." > text.txt
    java -jar stanfordNIF-beta.jar -f text -i text.txt -t file
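On the implementation side, the three input types shown in these examples amount to a small dispatch. The following Python sketch illustrates one possible shape; the function name read_input, the stdin keyword (added so the function can be exercised without a real pipe) and the UTF-8 decoding are assumptions.

```python
import io
import sys
from urllib.request import urlopen


def read_input(value, intype="direct", stdin=sys.stdin):
    """Resolve the input parameter to the data it denotes."""
    if intype == "direct":
        # A lone "-" means: read the data from standard input.
        return stdin.read() if value == "-" else value
    if intype == "file":
        with open(value, encoding="utf-8") as handle:
            return handle.read()
    if intype == "url":
        with urlopen(value) as response:
            return response.read().decode("utf-8")
    raise ValueError(f"unsupported intype: {intype}")


# Simulate "echo -n ... | nif-cli -i -" with an in-memory stream.
piped = io.StringIO("My favourite actress is Natalie Portman.")
print(read_input("-", stdin=piped))
```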
[...] informat and outformat.
NIF-WS implementations MUST always set the appropriate Content-Type: header.
Mapping of informat and outformat to media types:
- turtle is the same as text/turtle
- json-ld is the same as application/ld+json
- text is the same as text/plain
- html is the same as text/html
- rdfxml is the same as application/rdf+xml
- ntriples is the same as application/n-triples (we use the latest spec as source)
- timedtext is the same as application/ttml+xml
- xml is the same as application/xml
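The mapping above translates directly into a lookup table. The sketch below shows one way a NIF-WS might pick its Content-Type header and derive an output format from an Accept header. Treating Accept as a substitute for outformat, ignoring q-values, and the names MEDIA_TYPES, content_type_for and outformat_from_accept are all assumptions made for illustration.

```python
# Format names and media types as listed in the mapping above.
MEDIA_TYPES = {
    "turtle": "text/turtle",
    "json-ld": "application/ld+json",
    "text": "text/plain",
    "html": "text/html",
    "rdfxml": "application/rdf+xml",
    "ntriples": "application/n-triples",
    "timedtext": "application/ttml+xml",
    "xml": "application/xml",
}
FORMATS = {media: name for name, media in MEDIA_TYPES.items()}


def content_type_for(outformat):
    """Media type to send in the Content-Type header for an output format."""
    return MEDIA_TYPES[outformat]


def outformat_from_accept(accept, default="turtle"):
    """Return the first known format in an Accept header (q-values are ignored)."""
    for item in accept.split(","):
        media = item.split(";")[0].strip().lower()
        if media in FORMATS:
            return FORMATS[media]
    return default


print(content_type_for("turtle"))
print(outformat_from_accept("text/plain"))
```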
Code | Text | Description |
---|---|---|
200 | OK | Success! |
400 | Bad Request | The request was invalid. An accompanying error message will explain why. |
401 | Unauthorized | Authentication credentials were missing or incorrect. |
406 | Not Acceptable | Returned by the API when an invalid format is specified in the request. |
500 | Internal Server Error | Something is broken. Contacting the maintainer might be appropriate. |
503 | Service Unavailable | The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. |
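How a service might choose among these status codes for an incoming request is sketched below. The order of the checks, the set of supported formats and the function name status_for are assumptions; the table only defines what each code means.

```python
SUPPORTED_FORMATS = {"turtle", "text"}  # assumed: only the required formats


def status_for(params, api_key_required=False):
    """Pick an HTTP status code and reason phrase for a dict of request parameters."""
    if api_key_required and not params.get("apikey"):
        return 401, "Unauthorized"
    if not params.get("input"):
        # input is the only parameter a request must carry.
        return 400, "Bad Request"
    for name in ("informat", "outformat"):
        if params.get(name, "turtle") not in SUPPORTED_FORMATS:
            return 406, "Not Acceptable"
    return 200, "OK"


print(status_for({"input": "My favourite actress is Natalie Portman.", "informat": "text"}))
print(status_for({"informat": "text"}))
```

Codes 500 and 503 are not modelled here because they describe server-side failures rather than properties of the request.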
port (no short form) |
Port Number
|
info (no short form) |
Info: If info=true, the NIF implementation SHOULD display all implemented parameters. TODO: this will be RDF as well in the future. |
apikey (k)
|
API Key: If access is limited, the client MUST use this parameter to give its authentication token or API key. Note: If your NIF implementation doesn't require authentication, this parameter SHOULD be ignored. |
help (h)
|
Help: Print help.
|
config (c)
|
Config: A string which can be used to configure the NIF implementation.
|
configfile (cf)
|
Config File: A file which can be used to configure the NIF implementation.
|
logprefix (lp)
|
Log Prefix: TODO: this parameter is still informative. No implementation necessary. Used to create the prefix for log URIs in the same way as prefix. Please use a sensible default.
|
profile (pr)
|
Profile: simple, stanbol, oa.
|
uspara (up)
|
URI Scheme Parameters: TODO: this parameter is still informative. No implementation necessary. Some parameters for certain urischemes, e.g. contextlength |
outfile (of)
|
Output File: A file into which the results of NIF-CLI should be written. Note: this option is for operating systems that do not use pipes. |
    @prefix rlog: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/rlog#> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
    @prefix logprefix: <http://example.org/user-defined-logPrefix#> .

    logprefix:user-defined-urn
        a rlog:Entry ;
        rlog:level rlog:ERROR ;
        rlog:date "2013-06-08T17:00:00Z"^^xsd:dateTime ;
        rlog:message "Log message" ;
        # optional:
        rlog:resource <http://example.com/some-RDF-resource> .

    # default prefix, not dereferenceable
    @prefix logprefix: <http://nlp2rdf.lod2.eu/instance/log/> .

    # arbitrary id at the moment.
    logprefix:id_ERROR_0_1377165120346
        a rlog:Entry ;
        rlog:date "2013-08-22T09:52:00.347Z"^^xsd:dateTime ;
        rlog:level rlog:ERROR ;
        rlog:message """http://example.com/error4.txt#char=1,25: for the context, the length of nif:isString (27) must equal nif:endIndex (25)""" ;
        rlog:resource <http://example.com/error4.txt#char=1,25> .
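An implementation could assemble such log entries with plain string formatting, as in the following Python sketch. The function name rlog_entry and its signature are hypothetical, the message is inserted without escaping (so it must not contain triple quotes or backslashes), and a real implementation would more likely use an RDF library.

```python
from datetime import datetime, timezone


def rlog_entry(log_prefix, entry_id, message, level="ERROR", resource=None, when=None):
    """Serialize one rlog:Entry as Turtle; rlog:resource is optional."""
    when = when or datetime.now(timezone.utc)
    lines = [
        "@prefix rlog: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/rlog#> .",
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
        f"@prefix logprefix: <{log_prefix}> .",
        "",
        f"logprefix:{entry_id} a rlog:Entry ;",
        f"    rlog:level rlog:{level} ;",
        f'    rlog:date "{when.isoformat()}"^^xsd:dateTime ;',
        # No escaping is done here: message must be free of """ and backslashes.
        f'    rlog:message """{message}"""' + (" ;" if resource else " ."),
    ]
    if resource:
        lines.append(f"    rlog:resource <{resource}> .")
    return "\n".join(lines)


entry = rlog_entry(
    "http://example.org/user-defined-logPrefix#",
    "user-defined-urn",
    "Log message",
    resource="http://example.com/some-RDF-resource",
    when=datetime(2013, 6, 8, 17, 0, tzinfo=timezone.utc),
)
print(entry)
```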