kim-sangah/culturepedia

Project name

Culturepedia

Introduction

Culturepedia is a platform for conveniently exploring performances across a wide range of genres. It analyzes user tastes to provide personalized recommendations and helps users make better choices through in-depth reviews.

Development Period

  • Planning and Documentation: 2024.09.23 ~ 2024.09.27
  • Development: 2024.09.27 ~ 2024.10.23

Team Roles and Responsibilities

  • ACCOUNTS
    • Sign up (Frontend: 김상아 / Backend: 김채림)
    • Log in (Frontend: 김상아, 김채림 / Backend: 김채림)
    • Log out (Frontend: 김상아 / Backend: 김상아)
    • Profile view (Frontend: 김채림 / Backend: 김상아)
    • Profile edit (Frontend: 김상아 / Backend: 김상아)
    • Account deletion (Frontend: 김채림 / Backend: 김채림)
  • PERFORMANCES
    • Data pipeline construction (Backend: 이광욱, 김상아)
    • KOPIS API crawling (Frontend: 이광욱 / Backend: 김채림)
    • Search (Frontend: 이광욱 / Backend: 김상아)
    • Detail page (Frontend: 이광욱 / Backend: 김상아)
    • Favorites (Frontend: 김채림 / Backend: 김상아)
    • Reviews (Frontend: 김채림 / Backend: 김상아)
    • Category-based recommendations (Frontend: 김채림 / Backend: 김채림)
    • Cookie and session storage (Frontend: 김채림, 이광욱 / Backend: 김채림)
    • AI hashtagging (Backend: 이광욱)
    • Scheduling (Backend: 이광욱)
    • Image resizing (Backend: 이광욱)
  • DEPLOYMENT
    • Deployment (김상아)
  • DEBUGGING
    • Debugging and improvements based on UT (unit testing): all team members (공통)

Full Technology Stack Overview

  • Backend: Python, Django REST Framework (DRF), LLM
  • Frontend: HTML, CSS, JS, Bootstrap
  • Database: SQLite, PostgreSQL
  • Version Control: GitHub
  • Deployment: AWS
  • IDE: VSCode
  • APIs Used: KOPIS API (Korea Performance Information System), OpenAI API
  • Other Tools and Libraries:
    • APScheduler==3.10.4
    • asgiref==3.8.1
    • certifi==2024.8.30
    • charset-normalizer==3.3.2
    • Django==4.2
    • django-apscheduler==0.7.0
    • django-seed==0.3.1
    • djangorestframework==3.15.2
    • djangorestframework-simplejwt==5.3.1
    • ElementTreeFactory==1.0
    • Faker==30.1.0
    • idna==3.10
    • pillow==10.4.0
    • psycopg2==2.9.9
    • PyJWT==2.9.0
    • python-dateutil==2.9.0.post0
    • pytz==2024.2
    • requests==2.32.3
    • six==1.16.0
    • sqlparse==0.5.1
    • toposort==1.10
    • typing_extensions==4.12.2
    • tzdata==2024.2
    • tzlocal==5.2
    • urllib3==2.2.3
    • xmltodict==0.13.0

Key Features

Accounts

  • Sign up

    • Users can create an account by providing mandatory information (email, password, username) and optional details (gender, birthday). After signing up, users can access features such as favorites, writing reviews, and performance recommendations. (Screenshots: sign-up required fields, sign-up complete)
  • Log in

    • Users can log in using their email and password. Authentication is handled via JWT (JSON Web Token). (Screenshot: log in)
  • Log Out

    • Users can log out, and their JWT token will be invalidated.
  • Profile Edit

    • Users can update their username, password, gender, and birthday. (Screenshot: profile edit)
  • Account Deletion

    • Users can delete their account by entering their password. This removes their data from the system.
  • Profile View
    • Users can view their profile details (email, username, gender, birthday) and access their performance viewing history.
    • Viewing History
      • Users can see the performances they have reviewed.
    • Favorites
      • Users can view a list of performances they have marked as favorites.

    (Screenshot: profile view)

Performances

  • Browse Performances
    • Users can browse performance rankings by category, sort them by sales or newest, and filter performances by region. Clicking a performance poster or title takes users to the performance's detail page. (Screenshots: this week's TOP, by category)
  • Search Performances
    • Users can search performances by keywords (title, actor, venue, production company). (Screenshot: search)
  • Performance Details
    • Users can view detailed information about a performance, including the title, start and end dates, venue, cast, crew, runtime, age restrictions, producers, pricing, poster, synopsis, genre, and status. (Screenshot: detail page, performance info)
  • Write Reviews
    • Users can write reviews for performances they have attended, including a star rating out of 5. (Screenshot: detail page, reviews)
  • Edit Reviews
    • Users can edit reviews they have written.
  • Delete Reviews
    • Users can delete reviews they have written. (Screenshot: reviews)
  • Add to Favorites
    • Users can add performances to their favorites list.
  • Remove from Favorites
    • Users can remove performances from their favorites list. (Screenshot: favorites)
  • Performance Recommendations
    • Based on the user's reviews (ratings of 3 stars or higher), favorites, and search tags, the system recommends personalized performances. If there is insufficient user data, users can manually choose categories, characteristics, moods, and regions to get recommendations. (Screenshot: performance recommendations)
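The recommendation behavior above can be sketched as a simple tag-overlap score between what a user has liked (from high-rated reviews, favorites, and search tags) and each performance's tags. The data shapes and function below are illustrative assumptions, not the project's actual implementation:

```python
def recommend(performances, liked_tags, top_n=3):
    """Rank performances by how many of their tags the user already likes."""
    scored = [
        (sum(tag in liked_tags for tag in p["tags"]), p["title"])
        for p in performances
    ]
    scored.sort(reverse=True)  # highest tag overlap first
    return [title for score, title in scored[:top_n] if score > 0]


performances = [
    {"title": "Hamlet", "tags": ["play", "classic", "seoul"]},
    {"title": "Cats", "tags": ["musical", "family"]},
    {"title": "Nutcracker", "tags": ["ballet", "classic"]},
]
liked = {"classic", "seoul"}
print(recommend(performances, liked))  # ['Hamlet', 'Nutcracker']
```

When the score is zero for everything (insufficient user data), the project falls back to letting the user pick categories, characteristics, moods, and regions manually.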

Requirements

Backend Requirements (Python 3.10, Django REST Framework)

  • Sign Up (accounts)
    • Allows new users to register.
  • Log In (signin)
    • Users can log in using their email and password.
  • Log Out (signout)
    • Users can log out and terminate their session.
  • Profile Edit (modify)
    • Users can update their profile information.
  • Account Deletion (delete)
    • Users can delete their account.
  • Profile View (profile)
    • Displays the user's profile.
  • Performance List View (performances)
    • All users can view performance data.
  • Performance Search (search)
    • Performances can be searched by title, actor, venue, or production company.
  • Performance Details View (detail)
    • Displays detailed information about a performance.
  • Performance Reviews (create/edit/delete)
    • Users can write, edit, and delete reviews for performances.
  • Add/Remove Favorites
    • Users can add performances to or remove them from their favorites list.

Frontend Requirements (HTML, CSS, Bootstrap)

  • UI/UX Design
    • Build the user interface (UI) using Bootstrap.
  • Sign Up/Log In/Log Out Pages
    • Pages for managing user accounts.
  • Performance Review and Favorites Pages
    • Provide UI for users to write reviews and manage their favorite performances.
  • Profile Page
    • Implement a page where users can view and edit their profile information.

Database (SQLite, PostgreSQL)

  • Review Table
    • Stores review data for each performance.
  • Favorites Table
    • Stores a list of performances favorited by users.
  • User Table
    • Stores user information and profiles.

API Requirements

  • KOPIS API
    • Provides performance lists and detailed performance information.
  • OpenAI API
    • Utilizes LLM to offer additional AI-powered features.
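KOPIS is queried over plain GET requests carrying a service key, a date range, and paging parameters. Building such a query can be sketched as below; the endpoint path and parameter names are assumptions to be checked against the official KOPIS API documentation:

```python
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names; verify against the KOPIS docs
BASE = "http://www.kopis.or.kr/openApi/restful/pblprfr"
params = {
    "service": "YOUR_SERVICE_KEY",  # issued by KOPIS
    "stdate": "20241001",           # performance period start
    "eddate": "20241031",           # performance period end
    "cpage": 1,                     # page number
    "rows": 100,                    # results per page
}
url = f"{BASE}?{urlencode(params)}"
```

The `rows`, `stdate`, and `eddate` parameters are the same knobs the Troubleshooting section below tunes to avoid missing data while limiting server load.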

Service Structure

Architecture (diagram: Culturepedia process flow, draw.io)

WireFrame

https://www.figma.com/design/bc7ezCAoLV0OBzk35nNMSQ/Culturepedia-WireFrame?m=auto&t=M5PEKIqkya6poh6A-1

API Documentation

https://documenter.getpostman.com/view/38012126/2sAY4rE4t4

ERD

(Diagram: Culturepedia ERD)

Folder Structure

CULTUREPEDIA
│
├── accounts/              # Handles user authentication and permissions
├── culturepedia/          # Project configuration files (e.g., settings.py)
├── performances/          # Features for performance browsing, searching, and recommendations
│   ├── bots.py            # Handles interactions with the OpenAI API
│   └── tasks.py           # Scheduler for saving data from the KOPIS API to the local database
├── static/                # Static files (CSS, images, JavaScript, HTML) served to the client
│   ├── css/               # Stylesheets that define the look and feel of the website
│   ├── img/               # Performance-related images used across the site
│   └── js/                # Client-side scripts that handle dynamic interactions on the web pages
├── facility.py            # Stores detailed data about performance venues from the KOPIS API
├── kopis_api_detail.py    # Stores detailed data about specific performances from the KOPIS API
├── kopis_api.py           # Stores performance list data from the KOPIS API
├── db.sqlite3             # SQLite database file
├── .gitignore             # Specifies which files to ignore in version control
├── manage.py              # Django project management script
└── requirements.txt       # List of required Python packages

Troubleshooting

API DB vs Server DB: storing large amounts of performance data locally can strain the server, so querying the API DB, which is updated in real time, was initially recommended.

API DB

  • Issue: Missing data during lookup and search, plus server load.
    Cause: Instead of retrieving all performances, only a limited number of performances per page is retrieved.
    Solution: Increase the number of performances (rows) retrieved per page and limit the performance period (start_date, end_date).
    Caution: if there are many performances and long periods, lookup and search times can increase greatly.
  • Because of the complicated code and the traffic problems, we switched to storing the data in a local DB for lookup and retrieval.

Server DB

  • Issue: Missing results due to performance lookup and search failures.
    Cause: Not all performances are stored, only those within a certain period.
    Solution: Store data periodically starting from the beginning of development, excluding past performances. This mitigates traffic problems compared with querying the API DB directly.
  • Issue: The performance API returns XML, raising concerns about storing the data in Django.
    Cause: We had only worked with the JSON format while using Django.
    Solution: Use the xmltodict module to parse XML and convert it into JSON (dict) format.
  • Issue: Performance details in JSON format are not stored in the DB.
    Cause: The performance venue is linked via a foreign key in the model.
    Solution: Fix the save order so that the venue is saved first, followed by the performance details.
  • Issue: As time passes, performance information needs to be updated.
    Cause: The API DB is always up to date, so this issue did not arise there. We initially considered triggers based on the performance period (start_date, end_date) to update the performance state, but realized that other detailed information might also need updates.
    Solution: Compare the local DB and the API DB every midnight for updates, using a scheduler and subprocess.
  • Issue: Subprocess is not executed.
    Cause: Subprocess references a different Python environment.
    Solution: Specify the path of the Python virtual environment.
  • Issue: The scheduler is executed multiple times.
    Cause: No code exists to prevent duplicate executions.
    Solution: Add a check for whether the scheduler is already running.
  • Issue: Data is not updated automatically.
    Cause: The code that saves performances hard-codes the date and page, preventing automatic updates.
    Solution: Retrieve all pages starting from the current date using a while loop.
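The xmltodict fix above turns KOPIS XML responses into Python dicts before saving. A simplified stdlib sketch of that conversion (the element names are illustrative, not the exact KOPIS schema, and repeated sibling tags are not handled here, unlike in xmltodict):

```python
import xml.etree.ElementTree as ET


def xml_to_dict(elem):
    """Recursively convert an XML element into a nested dict (simplified xmltodict.parse)."""
    children = list(elem)
    if not children:
        return elem.text  # leaf node: return its text content
    return {child.tag: xml_to_dict(child) for child in children}


sample = "<dbs><db><prfnm>Hamlet</prfnm><fcltynm>Seoul Arts Center</fcltynm></db></dbs>"
root = ET.fromstring(sample)
print({root.tag: xml_to_dict(root)})
# {'dbs': {'db': {'prfnm': 'Hamlet', 'fcltynm': 'Seoul Arts Center'}}}
```

The resulting dict can then be fed to Django model constructors or DRF serializers the same way a JSON response would be.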

Deploy

  • Issue: Network error (connection timed out) when attempting to connect to the server.
    Cause: Subnet issue → Network ACL check → the rule was marked as Deny, which blocked the connection.
    Solution: Edit the Network ACL settings (Inbound rules → Add new rule → set to Allow).

OPENAI API

  • Issue: Sending an image URL to OpenAI sometimes results in successful analysis and sometimes not.
    Cause: OpenAI does not seem to recognize the URL reliably.
    Solution: Download and resize the image within the project, then analyze the resized image.
import os
from io import BytesIO

import requests
from PIL import Image
from django.conf import settings


def resizing_images(url_list, quality=85):      # quality: 70 ~ 90, default: 85
    img_folder = os.path.join(settings.STATICFILES_DIRS[0], "img")
    os.makedirs(img_folder, exist_ok=True)

    for url in url_list:
        try:
            # Download the image from the URL
            response = requests.get(url)
            response.raise_for_status()  # Raise on HTTP errors

            # Extract the image file name from the URL
            filename = os.path.basename(url)

            # Create a folder per performance
            folder_name = os.path.join(img_folder, filename.split('_')[1])
            os.makedirs(folder_name, exist_ok=True)

            # Open the image in memory
            im = Image.open(BytesIO(response.content))

            # Re-save with reduced JPEG quality to shrink the file size
            im.save(os.path.join(folder_name, filename), quality=quality)

        except Exception:
            # Skip images that fail to download or save
            continue
  • Issue: An AI was built to extract hashtags from given field values, but hashtag_list contains unnecessary content in addition to the hashtags.
    Cause: The gpt-4o-mini model outputs unnecessary messages, causing incorrect behavior.
    Solutions:
    1. Use the gpt-4o model instead of gpt-4o-mini.
       Result: for the same query, no unnecessary messages are output.
       Drawback: relatively more expensive.
    2. Continue using gpt-4o-mini but tighten the prompt (e.g., "Please provide hashtags." ⇒ "Just provide the hashtags.").
       Result: only hashtags are output, without any unnecessary messages.
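Besides tightening the prompt, the model's reply can be post-processed so that only hashtag tokens survive, regardless of any surrounding chatter. This filter is an illustrative suggestion, not part of the project's code:

```python
import re


def extract_hashtags(reply: str) -> list[str]:
    """Keep only #-prefixed tokens from a model reply, dropping any extra text."""
    return re.findall(r"#\w+", reply)


reply = "Sure! Here are your hashtags: #musical #family #seoul"
print(extract_hashtags(reply))  # ['#musical', '#family', '#seoul']
```

Combining a strict prompt with this kind of defensive parsing makes the pipeline robust even if the model occasionally adds commentary.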
  • Issue: The scheduler runs twice, causing the AI to generate hashtags twice (token waste).
    Cause: When the server is started using subprocess, two processes are executed simultaneously.

Attempted solutions:

  1. Use jobstores to store scheduler jobs in the DB and prevent execution if a scheduler with the same ID already exists.
     ⇒ Duplicate execution still occurs, with two IDs being created (failed).

scheduler.add_jobstore(jobstores.DjangoJobStore(), "default")  # Save scheduler jobs to the Django DB

  2. Set a scheduler execution-status flag as a global variable to check execution status and prevent duplicate execution.
     ⇒ Duplicate execution still occurs (failed).

scheduler_running = False  # global variable flag

def start_scheduler():
    global scheduler_running
    scheduler_running = True

  3. Root cause: when the server is started using subprocess, two processes (1. performance lookup, reviews, recommendations, etc.; 2. API DB storage, etc.) run simultaneously, so start_scheduler inside ready() is executed twice.

Conclusion

Add code to the ready method to prevent duplicate execution.

  • Before

def ready(self):
    start_scheduler()

  • After

def ready(self):
    # RUN_MAIN is set by Django's dev-server autoreloader, so the scheduler
    # starts only in the process where it is present, preventing a duplicate start
    if os.environ.get('RUN_MAIN', None) is not None:
        print('RUN_MAIN:', os.environ.get('RUN_MAIN', None))
        from .tasks import start_scheduler
        start_scheduler()
  • Issue: The scheduler does not run in the deployment environment.
    Cause: RUN_MAIN is set only by Django's development server, so the conditional check blocks the scheduler in production.
    Solution: Comment out the conditional statement in the deployment environment and execute the scheduler directly.
