CVE database
This repo generates a clean representation of CVEs and extracts pull request information, along with each file before and after the change.
CVE
CVEs are security vulnerabilities. They are referenced (though not all of them) in the NVD.
GHSA
GitHub Advisories (GHSA) is a database of CVEs and GitHub-originated security advisories affecting the open source world.
Connection to GitHub
To make requests with the GitHub API, you must authenticate using a header: Authorization: token [token].
To get the files: here
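As an illustration of the header described above, here is a minimal sketch using only the Python standard library. The helper name build_request, the API_ROOT constant, and the Accept header are assumptions made for this example and are not part of this repo.

```python
import urllib.request

# Assumption: the public GitHub REST API root.
API_ROOT = "https://api.github.com"


def build_request(path: str, token: str) -> urllib.request.Request:
    """Build an authenticated GET request for a GitHub API path (hypothetical helper)."""
    return urllib.request.Request(
        API_ROOT + path,
        headers={
            # The header format described in this README.
            "Authorization": f"token {token}",
            # Assumption: ask for GitHub's JSON media type.
            "Accept": "application/vnd.github+json",
        },
    )


# Usage (performs a network call, so it is left commented out):
# import json, os
# req = build_request("/repos/{owner}/{repo}", os.environ["GITHUB_TOKEN"])
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Reading the token from an environment variable, as in the commented usage, keeps it out of the source code.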
Rate limit
There are two rate limits on the GitHub API: the primary one and the secondary one.
Primary limit
The primary rate limit (with authentication) is 1000 requests per hour per repository, which is not a problem here.
Secondary rate limit
The secondary limit is 900 points per minute, and most GET requests are worth 1 point. That amounts to 15 requests per second. To be safe, I wait 0.1 s between requests. There is no way to check it.
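The 0.1 s wait can be sketched as a small throttle object. This is only one possible way to do it: the class name Throttle and its parameters are hypothetical, and the clock and sleep functions are injectable only to make the sketch easy to test. With a 0.1 s minimum interval, at most 10 requests go out per second, which stays under the 15 requests per second computed above.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait() (hypothetical helper)."""

    def __init__(self, min_interval: float = 0.1, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request; None before the first one

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            # Remaining time until min_interval has elapsed since the last request.
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay


# Usage: call throttle.wait() right before each API request.
# throttle = Throttle(0.1)
# for url in urls:
#     throttle.wait()
#     ...  # send the request
```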
Get the files of the pull request
We use the GitHub API to retrieve the files of the pull request. The endpoint is: /repos/{owner}/{repo}/pulls/{pull_number}/files. For each file, it returns the URL of the patched file and the patch.
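The following sketch shows how the response of this endpoint might be reduced to the two pieces of information mentioned above. The function names are hypothetical. The field names filename, raw_url, and patch are those of the GitHub response; the patch field can be absent (for example for binary files), so it is read with a default.

```python
import json


def pull_files_path(owner: str, repo: str, pull_number: int) -> str:
    """Return the API path that lists the files of a pull request."""
    return f"/repos/{owner}/{repo}/pulls/{pull_number}/files"


def summarize_pull_files(response_body: str) -> list:
    """Keep the file name, the URL of the patched file, and the patch for each file.

    response_body is the JSON text returned by the endpoint (a list of file objects).
    """
    files = json.loads(response_body)
    return [
        {
            "filename": f["filename"],
            "raw_url": f.get("raw_url"),  # URL of the patched file
            "patch": f.get("patch"),      # None when the API omits the patch
        }
        for f in files
    ]
```

The network call itself is left out on purpose, so that the parsing can be tried on a saved response.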
Get files in commit
Get the commit info here: /repos/{owner}/{repo}/commits/{commit_sha}
Then use the files field to get the files.
For each file, you can get the content with the URL https://raw.githubusercontent.com/{owner}/{repo}/{commit_hash}/{file_path_in_commit} and the patch directly in the patch field.
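The steps above can be sketched as follows, starting from an already decoded commit response. All function names are hypothetical. Taking the "before" version from the first parent commit is an assumption of this sketch, not something this README specifies; the sha, parents, files, status, and previous_filename fields are those of the GitHub commit response.

```python
from urllib.parse import quote


def commit_path(owner: str, repo: str, commit_sha: str) -> str:
    """Return the API path for the commit info."""
    return f"/repos/{owner}/{repo}/commits/{commit_sha}"


def raw_file_url(owner: str, repo: str, commit_hash: str, file_path: str) -> str:
    """Build the raw content URL; quote() keeps '/' and escapes spaces and the like."""
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{commit_hash}/{quote(file_path)}"


def files_before_and_after(owner: str, repo: str, commit: dict) -> list:
    """For each file of a commit, return its patch and the before/after content URLs."""
    sha = commit["sha"]
    parents = commit.get("parents", [])
    # Assumption: the first parent holds the "before" version of each file.
    parent_sha = parents[0]["sha"] if parents else None
    result = []
    for f in commit.get("files", []):
        status = f.get("status")
        # A renamed file lived under another path in the parent commit.
        before_path = f.get("previous_filename", f["filename"])
        has_before = parent_sha is not None and status != "added"
        has_after = status != "removed"
        result.append({
            "filename": f["filename"],
            "patch": f.get("patch"),  # taken directly from the patch field
            "before_url": raw_file_url(owner, repo, parent_sha, before_path) if has_before else None,
            "after_url": raw_file_url(owner, repo, sha, f["filename"]) if has_after else None,
        })
    return result
```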
Fix patch
Some CVEs have a broken patch: CVE-2014-1944, CVE-2014-2061, CVE-2014-3691, CVE-2014-7193