NLP Final Project

debiasing

Dependencies:

allennlp
scipy
numpy
sklearn
ELMoForManyLangs (https://github.com/HIT-SCIR/ELMoForManyLangs)

(x)BERT STUFF

bert_debiasing.py

Contains code for fine-tuning a BERT model on the NLI task. Depends on utils_nlp from microsoft/nlp-recipes: https://github.com/microsoft/nlp-recipes/tree/master/utils_nlp

Also depends on huggingface/transformers: https://github.com/huggingface/transformers

Steps

We need to get bert_entailment.py up and running on a real GPU for one of the simplified BERT models (ALBERT or DistilBERT).

Once that's done, we need to edit the loss function in the corresponding "model" in the transformers repo.
- Save (pickle) our basis info so it can be incorporated into the loss function.
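As a rough illustration of the pickling step, here is a minimal sketch. It assumes the "basis info" is a small set of orthonormal direction vectors spanning a bias subspace, and that the extra loss term penalizes the squared projection of an embedding onto that subspace; neither detail is confirmed by this README. The file name basis.pkl, the functions save_basis, load_basis, and projection_penalty, and the penalty weight are all hypothetical, and plain Python lists stand in for the tensors the real model code would use.

```python
import os
import pickle
import tempfile


def save_basis(basis, path):
    """Pickle a list of basis vectors (each a list of floats)."""
    with open(path, "wb") as f:
        pickle.dump(basis, f)


def load_basis(path):
    """Load the pickled basis back, e.g. from inside the model code."""
    with open(path, "rb") as f:
        return pickle.load(f)


def projection_penalty(embedding, basis):
    """Sum of squared dot products of the embedding with each basis vector.

    Assumes the basis vectors are orthonormal, so this equals the squared
    norm of the embedding's projection onto the subspace they span.
    """
    return sum(
        sum(e * b for e, b in zip(embedding, direction)) ** 2
        for direction in basis
    )


# Hypothetical one-dimensional subspace in a 3-d embedding space.
basis = [[1.0, 0.0, 0.0]]
basis_path = os.path.join(tempfile.mkdtemp(), "basis.pkl")
save_basis(basis, basis_path)

loaded = load_basis(basis_path)
task_loss = 0.5          # placeholder for the usual NLI loss value
penalty_weight = 0.1     # hypothetical hyperparameter
embedding = [2.0, 3.0, 4.0]
total_loss = task_loss + penalty_weight * projection_penalty(embedding, loaded)
```

In the actual fork, the load would presumably happen once when the model is constructed, and the penalty would be written with tensor operations so that gradients flow through it.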

To make this possible, we want to install transformers using the "from source" method, with pip install --user -e . or something similar.
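One way to confirm that an editable install took effect is to check where Python resolves the package from: after an editable install, the path should point into our local fork rather than into site-packages. This is a minimal sketch using only the standard library; the helper name package_location is hypothetical.

```python
import importlib.util


def package_location(name):
    """Return the file path a top-level package would be imported from.

    Returns None if the package is not installed (or if it is a namespace
    package, which has no single file of origin).
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


location = package_location("transformers")
if location is None:
    print("transformers is not installed in this environment")
else:
    # With an editable install this should be a path inside our fork.
    print("transformers resolves to:", location)
```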

Our folder structure should look like:

- root
  - debiasing (this repo)
    - .../bert_entailment.py (training file)
  - nlp-recipes/utils_nlp <- Can be a direct clone of the actual repo, or some sort of local install via pip? So this folder may be optional.
  - transformers <- This should be our fork of huggingface/transformers, which we will edit to change the loss function, etc.
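A quick sanity check for this layout could look like the sketch below. It assumes debiasing, nlp-recipes, and transformers are siblings under one shared root and treats nlp-recipes/utils_nlp as optional, as described above; the names REQUIRED, OPTIONAL, and missing_entries are hypothetical.

```python
from pathlib import Path

# Expected folders relative to the shared root (an assumed reading of the layout).
REQUIRED = ["debiasing", "transformers"]
OPTIONAL = ["nlp-recipes/utils_nlp"]


def missing_entries(root, entries):
    """Return the entries that do not exist as directories under root."""
    root = Path(root)
    return [entry for entry in entries if not (root / entry).is_dir()]


if __name__ == "__main__":
    # Assumes the script is run from inside the debiasing folder.
    root = Path.cwd().parent
    for entry in missing_entries(root, REQUIRED):
        print("missing required folder:", entry)
    for entry in missing_entries(root, OPTIONAL):
        print("missing optional folder:", entry)
```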

Alternatively: Put everything as subfolders in the debiasing root, and just delete all the .git stuff from within them
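If we go with the subfolder approach, the nested .git directories can be located before deleting anything. The sketch below only lists them; the function name nested_git_dirs is hypothetical, and it looks only for .git directories, not .git files.

```python
import os


def nested_git_dirs(root):
    """List .git directories found in subfolders of root, skipping root's own."""
    found = []
    for dirpath, dirnames, _ in os.walk(root):
        if ".git" in dirnames:
            dirnames.remove(".git")  # never descend into a .git directory
            if os.path.abspath(dirpath) != os.path.abspath(root):
                found.append(os.path.join(dirpath, ".git"))
    return found


if __name__ == "__main__":
    # Run from the debiasing root and review the list before removing anything.
    for path in nested_git_dirs("."):
        print(path)
```

After reviewing the output, each listed directory could be removed with shutil.rmtree. The repository's own top-level .git is deliberately left out of the list.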

