Various improvements to REST service (!4) · Merge requests · COMPRISE / Text Transformer

Ian Roberts requested to merge elg-in-process into elg May 15, 2021

This merge request combines a number of suggestions for improvements to the Flask-based REST service to make it more efficient and more suitable for integration into the ELG.

Load the model in-process

Currently the REST service spawns a new sub-process running the transformer command-line tool for every incoming request. This is inefficient, since each sub-process must load the NER model afresh. The way temporary files are used to pass the text back and forth also introduces a race condition: if the service runs with more than one gunicorn worker and two workers handle requests at the same time, both callers may see the same set of results (from one or other of the calls) rather than each seeing their own.

This merge request changes things so that, rather than spawning sub-processes, the Flask app loads the models once at startup into its own Python process, then uses the loaded models to handle all requests.
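As a rough illustration of the load-once pattern described above, here is a minimal sketch. It is not the code from this merge request: StubModel, its transform method and handle_request are hypothetical stand-ins, and the real service would do the equivalent inside the Flask app.

```python
class StubModel:
    """Hypothetical stand-in for the NER model; the real model API is not shown here."""

    load_count = 0  # counts how many times a model has been constructed

    def __init__(self):
        # The real service would read the model files from disk at this point.
        StubModel.load_count += 1

    def transform(self, text):
        # Placeholder for the actual text transformation.
        return text.upper()


# Constructed once when the module is imported, i.e. at service startup.
MODEL = StubModel()


def handle_request(text):
    """Serve one request with the already-loaded model: no sub-process, no temp files."""
    return MODEL.transform(text)
```

Because each request passes its text in as an argument and gets its result back as a return value, there is no shared temporary file for concurrent requests to overwrite. With several gunicorn workers, each worker process would typically hold its own loaded copy of the model.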

Improved compatibility with ELG API formats

For the ELG endpoint, I've made various changes to make better use of the ELG API specification:

The "texts" response format supports returning an array of texts, not just a single one, so I've made it return each sentence as a separate entry in the "texts" response array, marked with "role":"sentence" to indicate that the entries are separate sentences rather than alternatives for a single segment.
Similarly, there is a "structuredText" request type for use when the input text has already been segmented by the caller. I've extended the service to support both request types: a "text" request will be split into lines as before, while a "structuredText" request will be assumed to be already split into sentences.
Errors need to be reported in the correct ELG JSON format, so I've added exception handling to do this rather than returning the default Flask HTML error page.
On the basis of "never trust your users", I've also added proper validation of the replace_type and replace_prob parameter values, returning valid ELG error messages if the values are wrong.
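The changes listed above might fit together roughly as in the sketch below. It uses plain dictionaries rather than Flask so that it stays self-contained. Everything in it is an assumption rather than the code from this merge request: the helper names, the placeholder values in ALLOWED_REPLACE_TYPES, the [0, 1] range for replace_prob, the "params" request field, and the message codes (the "example.param.invalid" code is made up; the "elg.request.type.unsupported" code is quoted from memory of the ELG specification and should be checked against it).

```python
# Sketch only: names, allowed values and message codes are assumptions.
ALLOWED_REPLACE_TYPES = {"example-a", "example-b"}  # placeholder values


class ElgError(Exception):
    """Carries the fields of one ELG-style status message."""

    def __init__(self, code, text, params=()):
        super().__init__(text)
        self.code, self.text, self.params = code, text, list(params)


def failure(err):
    """Wrap an ElgError in an ELG-style JSON failure message."""
    return {"failure": {"errors": [
        {"code": err.code, "text": err.text, "params": err.params}
    ]}}


def validate_params(params):
    """Check replace_type and replace_prob, raising ElgError on bad values."""
    replace_type = params.get("replace_type")
    if replace_type not in ALLOWED_REPLACE_TYPES:
        raise ElgError("example.param.invalid",
                       "Invalid value {1} for parameter {0}",
                       ["replace_type", str(replace_type)])
    raw_prob = params.get("replace_prob")
    try:
        replace_prob = float(raw_prob)
    except (TypeError, ValueError):
        replace_prob = None
    # Assumed range: a probability between 0 and 1 inclusive (NaN is rejected too).
    if replace_prob is None or not 0.0 <= replace_prob <= 1.0:
        raise ElgError("example.param.invalid",
                       "Invalid value {1} for parameter {0}",
                       ["replace_prob", str(raw_prob)])
    return replace_type, replace_prob


def sentences_from_request(request):
    """Return the sentences of a "text" or "structuredText" request."""
    req_type = request.get("type")
    if req_type == "text":
        # Plain text: treat each line as one sentence.
        return request.get("content", "").splitlines()
    if req_type == "structuredText":
        # Already segmented by the caller: one sentence per entry.
        return [t.get("content", "") for t in request.get("texts", [])]
    raise ElgError("elg.request.type.unsupported",
                   "Request type {0} not supported by this service",
                   [str(req_type)])


def handle(request, transform=str.upper):
    """Build a "texts" response, or a failure message if anything is invalid."""
    try:
        validate_params(request.get("params") or {})
        texts = [{"role": "sentence", "content": transform(s)}
                 for s in sentences_from_request(request)]
        return {"response": {"type": "texts", "texts": texts}}
    except ElgError as err:
        return failure(err)
```

In a Flask app the same failure dictionary could be returned from an error handler with a suitable HTTP status code, so that callers never see the default HTML error page. The transform argument here is just a stand-in for the loaded model.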

Edited May 15, 2021 by Ian Roberts
