# Introduction
ConnectionLens ingests heterogeneous data sources (JSON, XML, HTML, CSV, RDF, relational databases, text files, PDF files) into a single, integrated graph. It stores this graph, which can then be queried using keywords and explored through a GUI.

The software consists of two main modules (sub-projects):
 - **Core** provides the main functionality (data ingestion, graph storage, querying).
 - **Gui** is a standalone J2EE web app for interactive exploration of the graph.


# Download & Install

### Software prerequisites
Required: 
- Java >= 1.8
- Maven >= 3.0.5
- PostgreSQL (tested with v.9.6)
- Python 3
- Tomcat >= 8

Optional:

- [Graphviz (DOT)](https://www.graphviz.org/) 
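
Before installing, it can help to confirm that the command-line tools listed above are on the `PATH`. The following is a minimal sketch, not part of ConnectionLens: the function `find_missing` is hypothetical, and the executable names (`java`, `mvn`, `psql`, `python3`, `dot`) are assumed to be the usual ones for these tools. It does not check version numbers, and it does not cover Tomcat, which is normally not a `PATH` command.

```shell
# Print the name of every command in the argument list that is not on PATH.
find_missing() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Assumed executable names for Java, Maven, the PostgreSQL client and Python 3.
missing=$(find_missing java mvn psql python3)
if [ -n "$missing" ]; then
  echo "Missing required tools:" $missing
else
  echo "All required command-line tools were found."
fi

# Graphviz is optional, so only report it as a hint.
[ -z "$(find_missing dot)" ] || echo "Optional: Graphviz (dot) not found."
```

If a tool is reported as missing although it is installed, its executable may simply have a different name on your system.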

### Download

To download ConnectionLens, clone this Git repository.

The cloned folder will include:

  1. The `core` folder which provides:
     
   - the jar file `connection-lens-core-full-1.1-SNAPSHOT.jar`, 
   - the python scripts folder (to run Flair),
   - a properties file with the default parameters.

  2. The `gui` folder which provides the war file `gui.war` to run the web app. 
   
  3. The `data` folder, which contains sample files in different formats: RDF, XML, HTML...
   
  4. The `models` folder, which provides all the French and English models used by the `TreeTagger` and `Stanford` tools.



## Installation Instructions

To run ConnectionLens (jar and war), follow the installation instructions for Core and Gui:


- [ConnectionLens-Core installation instructions](core_install.md)

- [ConnectionLens-Gui installation instructions](gui_install.md)


# Example
This example shows how to ingest a small dataset (five files in JSON and plain-text formats) and query the resulting graph with ConnectionLens.

#### Ingesting a small dataset: 
From the main folder, call the jar in the `core` folder with the following options:

```
java -jar core/connection-lens-core-full-1.1-SNAPSHOT.jar -DRDBMSDBName=cl_myinstance -i data/poc/2/deputes.json,data/poc/2/fb-etienne-chouard.txt,data/poc/2/medias.txt,data/poc/2/tweet-Ruffin.json,data/poc/2/rt-wikipedia.txt
```

![image_3.png](./image_3.png)
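
The `-i` option in the command above takes a comma-separated list of files, which becomes hard to read as the list grows. As a sketch only, the list can be built from one file per line. The helper `join_by_comma` and the variable names are hypothetical and not part of ConnectionLens. The jar path, database name and input files are copied from the command above.

```shell
# Join all arguments with commas: join_by_comma a b c  ->  a,b,c
# The body runs in a subshell so the IFS change does not leak out.
join_by_comma() (
  IFS=,
  printf '%s\n' "$*"
)

JAR=core/connection-lens-core-full-1.1-SNAPSHOT.jar
DB_NAME=cl_myinstance

# Same input files as in the command above, one per line.
INPUTS=$(join_by_comma \
  data/poc/2/deputes.json \
  data/poc/2/fb-etienne-chouard.txt \
  data/poc/2/medias.txt \
  data/poc/2/tweet-Ruffin.json \
  data/poc/2/rt-wikipedia.txt)

# Only call the jar when it is present, i.e. when run from the main folder.
if [ -f "$JAR" ]; then
  java -jar "$JAR" -DRDBMSDBName="$DB_NAME" -i "$INPUTS"
else
  echo "Jar not found at $JAR; run this from the main folder." >&2
fi
```

Adding or removing a data source then means editing a single line of the list.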

#### Asking queries in interactive mode:
On a database created and loaded as above, call the jar with the following options:
```
java -jar core/connection-lens-core-full-1.1-SNAPSHOT.jar -DRDBMSDBName=cl_myinstance -n -v -a
```

The `query>` prompt indicates that the shell is ready to accept queries.

![image_2.png](./image_2.png)




#### Collecting statistics on a set of queries (stored in a text file, one query per line)
```
java -jar core/connection-lens-core-full-1.1-SNAPSHOT.jar -DRDBMSDBName=cl_myinstance -n -qs -Q core/data/poc/2/demo.queries

```
 
#### Sample queries on the small example and the number of expected answer trees (ATs): 
Quotes can be used to denote groups of keywords to consider as an *atomic*
keyword, i.e. a string between quotes will be matched *exactly* against the 
graph.

Russie - 1 AT

Ruffin - 13 ATs

Poutine - 1 AT

Assemblée - 3 ATs

Soral Toulon - 2 ATs

Briand Halluin Tonolli - 1 AT
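
The sample queries above can be saved to a text file, one query per line, and passed to the statistics mode described earlier. The sketch below assumes it is run from the main folder. The file name `my.queries` is hypothetical, and the jar path, database name and flags are copied from the statistics command above.

```shell
JAR=core/connection-lens-core-full-1.1-SNAPSHOT.jar
QUERY_FILE=my.queries   # hypothetical file name

# One query per line, in the same order as the list above.
cat > "$QUERY_FILE" <<'EOF'
Russie
Ruffin
Poutine
Assemblée
Soral Toulon
Briand Halluin Tonolli
EOF

# Run the statistics mode on that file when the jar is available.
if [ -f "$JAR" ]; then
  java -jar "$JAR" -DRDBMSDBName=cl_myinstance -n -qs -Q "$QUERY_FILE"
fi
```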

# Contributing to ConnectionLens

If you find a bug or issue with ConnectionLens, please let us know. You can report bugs on the [issue tracker](https://gitlab.inria.fr/cedar/connectionlens/-/issues).

## How to report an issue

- Create an Inria GitLab account and/or log in with it. If you are not from Inria, please contact ioana.manolsecu@inria.fr or tayeb.merabti@inria.fr to be sponsored.

- Open a bug entry on https://gitlab.inria.fr/cedar/connectionlens/-/issues.

- Do not forget to mention your OS and any other relevant information.