RTLTDS Data Set

The RTLTDS dataset is a repository of text documents: articles from 3157 conferences held in Iran between 1366 and 1395 Hijri Shamsi (roughly 1987 to 2017 CE). It contains 559159 articles in total, each provided in three formats: txt, json, and xml. The dataset is well suited to research in areas such as data mining and big data. The files were collected from scientific articles indexed on the Civilica site.
A data warehouse has also been built from the data in the RTLTDS dataset. It uses a star schema consisting of five dimension tables and one fact table, and it is suitable for research fields such as decision support systems and OLAP services.
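
For readers unfamiliar with the layout, the following is a minimal sketch of a star schema with five dimension tables and one fact table, built in an in-memory SQLite database. The dimension names (conference, author, year, subject, city), the `page_count` measure, and the sample rows are all hypothetical and chosen only for illustration; the actual RTLTDS warehouse tables may be named and structured differently.

```python
import sqlite3

# Hypothetical dimension names; the real RTLTDS warehouse schema may differ.
DIMENSIONS = ["conference", "author", "year", "subject", "city"]


def build_star_schema(conn):
    """Create five dimension tables and one fact table that references them."""
    for dim in DIMENSIONS:
        conn.execute(
            f"CREATE TABLE dim_{dim} (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )
    fk_columns = ", ".join(
        f"{dim}_id INTEGER NOT NULL REFERENCES dim_{dim}(id)" for dim in DIMENSIONS
    )
    # page_count is an assumed measure, not taken from the dataset description.
    conn.execute(
        f"CREATE TABLE fact_article (id INTEGER PRIMARY KEY, {fk_columns}, "
        "page_count INTEGER NOT NULL)"
    )


def articles_per_year(conn):
    """OLAP-style roll-up: count fact rows grouped by the year dimension."""
    rows = conn.execute(
        "SELECT y.name, COUNT(*) FROM fact_article f "
        "JOIN dim_year y ON y.id = f.year_id "
        "GROUP BY y.name ORDER BY y.name"
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)

# Made-up sample rows, for illustration only.
for dim in DIMENSIONS:
    conn.execute(f"INSERT INTO dim_{dim} (id, name) VALUES (1, 'sample {dim}')")
conn.executemany(
    "INSERT INTO dim_year (id, name) VALUES (?, ?)", [(2, "1394"), (3, "1395")]
)
conn.executemany(
    "INSERT INTO fact_article "
    "(conference_id, author_id, year_id, subject_id, city_id, page_count) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 1, 2, 1, 1, 8), (1, 1, 3, 1, 1, 6), (1, 1, 3, 1, 1, 10)],
)

result = articles_per_year(conn)
print(result)
```

Each fact row stores only foreign keys and measures, so a roll-up along any dimension is a single join and `GROUP BY`, which is the access pattern OLAP services rely on.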

  • Data Set News

    ▶ The dataset website was moved to GitHub (2018-12-04).

    ▶ The RTLTDS Data Warehouse was published (2018-03-06).

    ▶ The RTLTDS Data Set was published (2017-11-22).