9 changes: 3 additions & 6 deletions README.md
@@ -97,7 +97,7 @@ The Data Review Tool can be launched by running the following command from the r
docker-compose up --build data-review-tool
```

Once the image is built and the container is running, the Data Review Tool can be accessed at http://localhost:8050/. There is a sample "extracted entities" JSON file provided for demo purposes.
Once the image is built and the container is running, the Data Review Tool can be accessed at <http://localhost:8050/>. There is a sample "extracted entities" JSON file provided for demo purposes.
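A quick way to confirm the container is up and serving the tool (a minimal sketch; it assumes the default `8050:8050` port mapping from `docker-compose.yml`):

```
# Hypothetical smoke test: the Dash app should respond with HTTP 200
# once the data-review-tool container is running.
curl -I http://localhost:8050/
```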

### Data Requirements

@@ -109,7 +109,7 @@ The article relevance prediction component requires a list of journals that are

#### Data Extraction Pipeline

As the full text articles provided by the xDD team are not publicly available we cannot create a public link to download the labelled training data. For access requests please contact Ty Andrews at [email protected].
As the full text articles provided by the xDD team are not publicly available we cannot create a public link to download the labelled training data. For access requests please contact Ty Andrews at <[email protected]>.

### Development Workflow Overview

@@ -143,9 +143,6 @@ WIP
│ │ ├── processed/ <- Processed data
│ │ └── interim/ <- Temporary data location
│ ├── data-review-tool/ <- Directory for data related to data review tool
│ │ ├── raw/ <- Raw unprocessed data
│ │ ├── processed/ <- Processed data
│ │ └── interim/ <- Temporary data location
├── results/ <- Directory for results
│ ├── article-relevance/ <- Directory for results related to article relevance prediction
│ ├── ner/ <- Directory for results related to named entity recognition
@@ -172,7 +169,7 @@ The UBC MDS project team consists of:

- **Ty Andrews**
- **Kelly Wu**
- **Jenit Jain**
- [![ORCID](https://img.shields.io/badge/orcid-0009--0007--8913--2403-brightgreen.svg)](https://orcid.org/0000-0002-7926-4935) [Jenit Jain](https://ht-data.com/)
- **Shaun Hutchinson**

Sponsors from Neotoma supporting the project are:
Binary file not shown.
Binary file not shown.
Empty file.
Empty file.
Empty file removed data/data-review-tool/raw/.gitkeep
Empty file.
3,343 changes: 0 additions & 3,343 deletions data/data-review-tool/raw/5681abe7cf58f1ba274d47fb.json

This file was deleted.

9 changes: 6 additions & 3 deletions docker-compose.yml
@@ -5,10 +5,13 @@ services:
build:
dockerfile: ./docker/data-review-tool/Dockerfile
context: .
environment:
- ARTICLE_RELEVANCE_BATCH=article-relevance-output.parquet
- ENTITY_EXTRACTION_BATCH=entity-extraction-output.zip
ports:
- "8050:8050"
volumes:
- ./data/data-review-tool:/MetaExtractor/data/data-review-tool
- ./data/data-review-tool:/MetaExtractor/inputs:rw
entity-extraction-pipeline:
image: metaextractor-entity-extraction-pipeline:v0.0.3
build:
@@ -17,8 +20,8 @@
ports:
- "5000:5000"
volumes:
- ./data/entity-extraction/raw/original_files/:/inputs/
- ./data/entity-extraction/processed/processed_articles/:/outputs/
- ./data/entity-extraction/raw/original_files/:/inputs/
- ./data/entity-extraction/processed/processed_articles/:/outputs/
environment:
- USE_NER_MODEL_TYPE=huggingface
- LOG_OUTPUT_DIR=/outputs/
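The compose changes above introduce `ARTICLE_RELEVANCE_BATCH` and `ENTITY_EXTRACTION_BATCH` environment variables for the data-review-tool service and remount `./data/data-review-tool` at `/MetaExtractor/inputs`. A minimal sketch of pointing the tool at different batch files without editing `docker-compose.yml` (the file names are illustrative, and this assumes the app resolves them inside the mounted inputs directory):

```
# Hypothetical one-off run with overridden batch-file variables;
# --service-ports keeps the 8050:8050 mapping from docker-compose.yml.
docker-compose run --service-ports \
  -e ARTICLE_RELEVANCE_BATCH=my-relevance-output.parquet \
  -e ENTITY_EXTRACTION_BATCH=my-entity-output.zip \
  data-review-tool
```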
14 changes: 7 additions & 7 deletions docker/data-review-tool/Dockerfile
@@ -13,16 +13,16 @@ ENV LOG_LEVEL=DEBUG
# Copy the entire repository folder into the container
COPY src ./src

# RUN git clone https://github.com/NeotomaDB/MetaExtractor
# WORKDIR MetaExtractor/
# RUN git switch dev

# Expose the port your Dash app is running on
EXPOSE 8050

RUN pip install pyarrow

RUN mkdir -p ./inputs

VOLUME ["/MetaExtractor/inputs"]

# Set the entrypoint command to run your Dash app
#CMD ["python", "src/data_review_tool/app.py"]

ENTRYPOINT python src/data_review_tool/app.py

# VOLUME [ "/MetaExtractor/data/data-review-tool" ]
ENTRYPOINT python src/data_review_tool/app.py
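With the added `RUN mkdir -p ./inputs` and `VOLUME ["/MetaExtractor/inputs"]` lines, the image can also be built and run without docker-compose. A sketch under the assumption that the working directory is the repository root (the image tag and host path are illustrative, not defined by this PR):

```
# Hypothetical standalone build and run; mounts the local review-tool data
# onto the /MetaExtractor/inputs volume declared in the Dockerfile.
docker build -f docker/data-review-tool/Dockerfile -t metaextractor-data-review-tool .
docker run --rm -p 8050:8050 \
  -v "$(pwd)/data/data-review-tool:/MetaExtractor/inputs" \
  metaextractor-data-review-tool
```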