We can identify four steps to follow: 

Step 1: Covid database study and identification

Step 2: Preprocessing 

This part will be explained in detail in this article. 

It has four substeps:

  • Clean JSON files
  • Eliminate duplicates
  • Select English-language documents
  • Generate stopwords
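
The four preprocessing substeps can be sketched roughly as follows. This is a minimal illustration and not the pipeline used in the article: the `title` and `abstract` field names, the tiny stopword list, and the stopword-ratio language check (a crude stand-in for a real language detector) are all assumptions made for the example.

```python
import json
import re

# Hypothetical base stopword list; a real run would use a much fuller one.
BASE_STOPWORDS = {"the", "of", "and", "in", "to", "a", "is", "for", "with", "that"}


def build_stopwords(extra_words=()):
    """Generate the stopword set: base list plus corpus-specific extras."""
    return BASE_STOPWORDS | {w.lower() for w in extra_words}


def clean_record(raw_json):
    """Parse one JSON document and keep only the fields used later.

    The 'title' and 'abstract' keys are assumed, not taken from the dataset.
    """
    doc = json.loads(raw_json)
    title = (doc.get("title") or "").strip()
    # Collapse runs of whitespace left over from extraction.
    abstract = re.sub(r"\s+", " ", doc.get("abstract") or "").strip()
    return {"title": title, "abstract": abstract}


def drop_duplicates(records):
    """Keep the first record seen for each lowercased title."""
    seen = set()
    unique = []
    for rec in records:
        key = rec["title"].lower()
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def looks_english(text, threshold=0.1):
    """Heuristic: treat text as English if enough tokens are English stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return False
    hits = sum(1 for t in tokens if t in BASE_STOPWORDS)
    return hits / len(tokens) >= threshold


def preprocess(raw_docs):
    """Clean, deduplicate, then keep only documents that look English."""
    records = [clean_record(r) for r in raw_docs]
    records = drop_duplicates(records)
    return [r for r in records if looks_english(r["abstract"])]
```

Calling `preprocess` on a list of raw JSON strings returns the cleaned, deduplicated English records, and `build_stopwords` can be extended with domain words that carry little signal in this corpus.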

Step 3: Processing

This part is explained in detail here. 

  • Extract keywords from abstracts, organized by task
  • Match keywords with documents
  • Apply named entity recognition (NER)
  • Eliminate duplicates
  • Extract summaries
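
As a rough illustration of the first two processing substeps, the sketch below extracts keywords from the abstracts of one task and then ranks documents by keyword overlap. It is a sketch under assumptions, not the method used in the article: frequency counting stands in for whatever keyword extraction was actually applied, and the function names and stopword list are hypothetical. NER and summary extraction are left out because they depend on the specific models chosen.

```python
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "of", "and", "in", "to", "a", "is", "for", "with", "by"}


def tokenize(text):
    """Lowercase, split into alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def extract_keywords(abstracts, top_n=5):
    """Return the top_n most frequent tokens across one task's abstracts."""
    counts = Counter()
    for abstract in abstracts:
        counts.update(tokenize(abstract))
    return [word for word, _ in counts.most_common(top_n)]


def match_documents(keywords, documents):
    """Rank documents by the number of distinct keywords they contain.

    documents maps a document id to its text; documents with no
    matching keyword are left out of the result.
    """
    scored = []
    for doc_id, text in documents.items():
        tokens = set(tokenize(text))
        score = sum(1 for k in keywords if k in tokens)
        if score:
            scored.append((doc_id, score))
    # Highest score first; ties broken by document id for stable output.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

Running `extract_keywords` once per task gives one keyword list per task, and `match_documents` then yields the candidate documents that the later substeps work on.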

Step 4: Results

For a better view of this architecture, you can go here.

  • N.B.: More than one model was analyzed for this part. You can find those models in the links below.