Culturepedia
Culturepedia is a platform for conveniently exploring cultural content across performances of various genres. It analyzes user tastes to provide personalized recommendations and helps users make better choices through in-depth reviews.
- Planning and Documentation: 2024.09.23 ~ 2024.09.27
- Development: 2024.09.27 ~ 2024.10.23
| API | FEATURES | FRONTEND | BACKEND |
|---|---|---|---|
| ACCOUNTS | Sign up | 김상아 | 김채림 |
| | Log in | 김상아, 김채림 | 김채림 |
| | Log out | 김상아 | 김상아 |
| | Profile View | 김채림 | 김상아 |
| | Profile Edit | 김상아 | 김상아 |
| | Account Deletion | 김채림 | 김채림 |
| PERFORMANCES | Data Pipeline Construction | - | 이광욱, 김상아 |
| | KOPIS API Crawling | 이광욱 | 김채림 |
| | Search | 이광욱 | 김상아 |
| | Detail Page | 이광욱 | 김상아 |
| | Favorites | 김채림 | 김상아 |
| | Reviews | 김채림 | 김상아 |
| | Category-Based Recommendations | 김채림 | 김채림 |
| | Cookie and Session Storage | 김채림, 이광욱 | 김채림 |
| | AI Hashtagging | - | 이광욱 |
| | Scheduling | - | 이광욱 |
| | Image Resizing | - | 이광욱 |
| DEPLOYMENT | Deployment | 김상아 | |
| DEBUGGING | Debugging, Improvements Based on UT (Unit Testing) | All members | |
- Backend: Python, Django REST Framework (DRF), LLM
- Frontend: HTML, CSS, JS, Bootstrap
- Database: SQLite, PostgreSQL
- Version Control: GitHub
- Deployment: AWS
- IDE: VSCode
- APIs Used: KOPIS API (Korea Performance Information System), OpenAI API
- Other Tools and Libraries:
- APScheduler==3.10.4
- asgiref==3.8.1
- certifi==2024.8.30
- charset-normalizer==3.3.2
- Django==4.2
- django-apscheduler==0.7.0
- django-seed==0.3.1
- djangorestframework==3.15.2
- djangorestframework-simplejwt==5.3.1
- ElementTreeFactory==1.0
- Faker==30.1.0
- idna==3.10
- pillow==10.4.0
- psycopg2==2.9.9
- PyJWT==2.9.0
- python-dateutil==2.9.0.post0
- pytz==2024.2
- requests==2.32.3
- six==1.16.0
- sqlparse==0.5.1
- toposort==1.10
- typing_extensions==4.12.2
- tzdata==2024.2
- tzlocal==5.2
- urllib3==2.2.3
- xmltodict==0.13.0
- Sign up
- Log in
- Log Out
  - Users can log out, and their JWT token will be invalidated.
- Profile Edit
- Account Deletion
  - Users can delete their account by entering their password. This removes their data from the system.
- Profile View
  - Users can view their profile details (email, username, gender, birthday) and access their performance viewing history.
- Viewing History
  - Users can see the performances they have reviewed.
- Favorites
  - Users can view a list of performances they have marked as favorites.
- Browse Performances
- Search Performances
- Performance Details
- Write Reviews
- Edit Reviews
- Users can edit reviews they have written.
- Delete Reviews
- Add to Favorites
- Users can add performances to their favorites list.
- Remove from Favorites
- Performance Recommendations
- Sign Up (accounts)
- Allows new users to register.
- Log In (signin)
- Users can log in using their email and password.
- Log Out (signout)
- Users can log out and terminate their session.
- Profile Edit (modify)
- Users can update their profile information.
- Account Deletion (delete)
- Users can delete their account.
- Profile View (profile)
- Displays the user's profile.
- Performance List View (performances)
- All users can view performance data.
- Performance Search (search)
- Performances can be searched by title, actor, venue, or production company.
- Performance Details View (detail)
- Displays detailed information about a performance.
- Performance Reviews (create/edit/delete)
- Users can write, edit, and delete reviews for performances.
- Add/Remove Favorites
- Users can add performances to or remove them from their favorites list.
- UI/UX Design
- Build the user interface (UI) using Bootstrap.
- Sign Up/Log In/Log Out Pages
- Pages for managing user accounts.
- Performance Review and Favorites Pages
- Provide UI for users to write reviews and manage their favorite performances.
- Profile Page
- Implement a page where users can view and edit their profile information.
- Review Table
- Stores review data for each performance.
- Favorites Table
- Stores a list of performances favorited by users.
- User Table
- Stores user information and profiles.
- KOPIS API
- Provides performance lists and detailed performance information.
- OpenAI API
- Utilizes LLM to offer additional AI-powered features.
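The search behavior described above (matching performances by title, actor, venue, or production company) can be sketched outside Django as a simple filter. This is an illustrative sketch only: the field names `title`, `cast`, `venue`, and `producer` are assumptions for the example, not the project's actual model fields.

```python
# Simplified, framework-free sketch of the performance search described above.
# Field names ("title", "cast", "venue", "producer") are illustrative
# assumptions, not the project's actual model fields.

def search_performances(performances, query):
    """Return performances whose title, actor, venue, or production company
    contains the query string (case-insensitive)."""
    q = query.lower()
    fields = ("title", "cast", "venue", "producer")
    return [
        p for p in performances
        if any(q in str(p.get(f, "")).lower() for f in fields)
    ]


performances = [
    {"title": "Hamlet", "cast": "John Doe", "venue": "Seoul Arts Center", "producer": "ACME"},
    {"title": "Cats", "cast": "Jane Roe", "venue": "Blue Theater", "producer": "Feline Co."},
]
print(search_performances(performances, "seoul"))  # matches by venue
```

In the real project this filtering would be done in a DRF view against the local DB, but the matching logic is the same idea.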
API Documentation (Postman): https://documenter.getpostman.com/view/38012126/2sAY4rE4t4
CULTUREPEDIA
│
├── accounts/ # Handles user authentication and permissions
├── culturepedia/ # Project configuration files (e.g., settings.py)
├── performances/ # Contains features for performance browsing, searching, and recommendations
│   ├── bots.py # Handles interactions with the OpenAI API
│ └── tasks.py # Scheduler for saving data from the KOPIS API to the local database
├── static/ # Static files (CSS, images, JavaScript, HTML) served to the client for faster loading
│ ├── css/ # Stylesheets that define the look and feel of the website
│ ├── img/ # Performance-related images used across the site
│   └── js/ # Client-side scripts that handle dynamic interactions on the web pages
├── facility.py # Stores detailed data about performance venues from the KOPIS API
├── kopis_api_detail.py # Stores detailed data about specific performances from the KOPIS API
├── kopis_api.py # Stores performance list data from the KOPIS API
├── db.sqlite3 # SQLite database file
├── .gitignore # Specifies which files to ignore in version control
├── manage.py # Django project management script
├── requirements.txt # List of required Python packages

API DB vs Server DB: Storing large amounts of performance data locally can put a strain on the server, so the initial approach was to query the API DB directly, since it updates in real time.
| Issue | Cause | Solution |
|---|---|---|
| Missing data during lookup and search; server load. | Only a limited number of performances per page is retrieved, rather than all performances. | Increase the number of performances (`rows`) retrieved per page and limit the performance period (`start_date`, `end_date`). Caution: with many performances and long periods, lookup and search times can increase significantly. |
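To make the page-size and period limits concrete, here is a hedged sketch of how such a KOPIS list request could be parameterized. The endpoint path and parameter names (`stdate`, `eddate`, `cpage`, `rows`) follow the public KOPIS OpenAPI documentation as we understand it; verify them against the current docs before relying on this.

```python
from urllib.parse import urlencode

# Hedged sketch: building a KOPIS performance-list request with a limited
# period and a larger page size. Endpoint path and parameter names are
# assumptions based on the public KOPIS OpenAPI docs; verify before use.
KOPIS_BASE = "http://www.kopis.or.kr/openApi/restful/pblprfr"


def build_kopis_list_url(service_key, start_date, end_date, page=1, rows=100):
    """Build a KOPIS list URL that limits the performance period and
    raises the number of rows fetched per page."""
    params = {
        "service": service_key,  # API key issued by KOPIS
        "stdate": start_date,    # e.g. "20241001"
        "eddate": end_date,      # e.g. "20241031"
        "cpage": page,
        "rows": rows,
    }
    return f"{KOPIS_BASE}?{urlencode(params)}"


url = build_kopis_list_url("MY_KEY", "20241001", "20241031", rows=200)
print(url)
```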
- Due to code complexity and traffic problems, the approach was changed to store the data in a local DB and run lookups and searches against it.
| Issue | Cause | Solution |
|---|---|---|
| Missing results when looking up and searching performances. | Only performances within a certain period are stored, not all of them. | Store data periodically starting from the beginning of development, excluding past performances. This also mitigates the traffic problems of querying the API DB directly. |
| The performance API returns XML, raising concerns about storing the data in Django. | The KOPIS API returns XML rather than the JSON format we had worked with in Django. | Use the `xmltodict` module to parse the XML and convert it into JSON format. |
| Performance details in JSON format are not stored in the DB. | The performance venue is linked via a foreign key in the model. | Fix the upload order so that the venue is saved first, followed by the performance details. |
| As time passes, performance information needs to be updated. | The API DB is always up-to-date, so this issue didn’t arise there. | Initially considered using triggers based on the performance period (start_date, end_date) to update the performance state, but realized that other detailed information might also need updates. Decided to compare the local DB and API DB every midnight for updates using a scheduler and subprocess. |
| Subprocess is not being executed. | Subprocess is referencing a different Python environment. | Specify the Python virtual environment path. |
| Scheduler is being executed multiple times. | No code exists to prevent duplicate executions. | Write code to check if the scheduler is already running. |
| Data is not being updated automatically. | The code that saves performances specifies the date and page, preventing automatic updates. | Modify the code to retrieve all pages starting from the current date using a while loop. |
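The `xmltodict` conversion mentioned in the table can be sketched as follows. The element names in the sample XML are simplified stand-ins for illustration, not the actual KOPIS response schema.

```python
import json

import xmltodict

# Minimal sketch of the XML-to-dict conversion described above.
# The tag names below are simplified stand-ins, not the real KOPIS schema.
sample_xml = """
<dbs>
  <db>
    <prfnm>Hamlet</prfnm>
    <fcltynm>Seoul Arts Center</fcltynm>
  </db>
</dbs>
"""

data = xmltodict.parse(sample_xml)       # XML string -> nested dict
performance = data["dbs"]["db"]
print(performance["prfnm"])              # performance title
as_json = json.dumps(data)               # dict -> JSON string, ready for storage
```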
| Issue | Cause | Solution |
|---|---|---|
| Network error (connection timed out) when attempting to connect to the server. | Subnet issue → Network ACL check → Allow/Deny → Marked as (X) Deny, which caused the problem. | Edit Network ACL settings → Inbound rules → Add new rule → Set to Allow. |
| Issue | Cause | Solution |
|---|---|---|
| Sending an image URL to OpenAI sometimes results in successful analysis, and sometimes not. | OpenAI does not seem to recognize the URL properly. | Download and resize the image within the project, then analyze the resized image. |
```python
import os
from io import BytesIO

import requests
from PIL import Image
from django.conf import settings


def resizing_images(url_list, quality=85):  # quality: 70-90, default: 85
    img_folder = os.path.join(settings.STATICFILES_DIRS[0], "img")
    if not os.path.exists(img_folder):
        os.makedirs(img_folder)
    for url in url_list:
        try:
            # Download the image from the URL
            response = requests.get(url)
            response.raise_for_status()  # Check for HTTP request errors
            # Extract the image file name
            filename = os.path.basename(url)
            # Create a folder per performance
            folder_name = f"{img_folder}/{filename.split('_')[1]}"
            if not os.path.exists(folder_name):
                os.mkdir(folder_name)
            # Open the image in memory
            im = Image.open(BytesIO(response.content))
            # Re-encode the image at the given JPEG quality to reduce file size
            im.save(os.path.join(folder_name, filename), quality=quality)
        except Exception:
            pass  # Skip images that fail to download or save
```
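Note that `resizing_images` above reduces file size by re-encoding at a lower JPEG quality; if the goal is also to cap the pixel dimensions before sending the image for analysis, Pillow's `thumbnail` can be used. The following is a hedged sketch, not the project's actual code, and the 512x512 cap is an arbitrary example value.

```python
from io import BytesIO

from PIL import Image


# Hedged sketch (not the project's actual code): shrink an image's pixel
# dimensions before analysis. The 512x512 cap is an arbitrary example value.
def shrink_image(data: bytes, max_size=(512, 512), quality=85) -> bytes:
    im = Image.open(BytesIO(data))
    im.thumbnail(max_size)  # Resize in place, preserving aspect ratio
    out = BytesIO()
    im.save(out, format="JPEG", quality=quality)
    return out.getvalue()


# Example with a generated image instead of a downloaded one
raw = BytesIO()
Image.new("RGB", (2000, 1000), "white").save(raw, format="JPEG")
small = shrink_image(raw.getvalue())
print(Image.open(BytesIO(small)).size)  # -> (512, 256)
```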
| Issue | Cause |
|---|---|
| The scheduler runs twice, causing the AI to generate hashtags twice (token waste). | When starting the server using subprocess, two processes are executed simultaneously. |
- Use jobstores to store scheduler logs in the DB and prevent execution if a scheduler with the same ID already exists.
⇒ Duplicate execution occurs, with two IDs being created (failed).
```python
scheduler.add_jobstore(jobstores.DjangoJobStore(), "default")  # Save job state to the Django DB
```
- Set a scheduler execution status flag as a global variable to check execution status and prevent duplicate execution.
⇒ Duplicate execution still occurs (failed).
```python
scheduler_running = False  # Global flag tracking whether the scheduler has started


def start_scheduler():
    global scheduler_running
    if scheduler_running:
        return  # Already running; skip the duplicate start
    scheduler_running = True
    ...
```
- When the server is started using subprocess, two processes run simultaneously (1. performance lookup, reviews, recommendations, etc. / 2. API DB storage, etc.), which causes the `start_scheduler` inside `ready` to be executed twice.
  ⇒ Add code to the `ready` method to prevent duplicate execution.
- Before

```python
def ready(self):
    start_scheduler()
```

- After

```python
import os

def ready(self):
    if os.environ.get('RUN_MAIN', None) is not None:  # Run the scheduler only in the process where RUN_MAIN is set
        print(' RUN_MAIN :', os.environ.get('RUN_MAIN', None))
        from .tasks import start_scheduler
        start_scheduler()
```
| Issue | Cause | Solution |
|---|---|---|
| The scheduler is not functioning in the deployment environment. | In deployment, `RUN_MAIN` is never set, so the conditional added above blocks the scheduler. | Comment out the conditional statement in the deployment environment so the scheduler starts. |