Commit 2400cb0

Copilot and josix committed
Regenerate CSV files with proper Python terminology and consolidation approach
Co-authored-by: josix <[email protected]>
1 parent f722995 commit 2400cb0

3 files changed: +336 -17940 lines changed

TERMINOLOGY_DICTIONARY.md

Lines changed: 22 additions & 24 deletions
@@ -18,20 +18,20 @@ The complete terminology dictionary containing important terms identified from P
 - **directory**: Directory of the source file
 - **example_files**: List of up to 5 files containing this term
 
-Total entries: ~14,700 unique terms
+Total entries: ~196 essential Python terms
 
 ### focused_terminology_dictionary.csv
-A curated subset of ~2,900 terms focusing on the most important Python terminology. Includes additional columns:
+A curated subset of ~118 terms focusing on the most important Python terminology. Includes additional columns:
 - **priority**: High/Medium priority classification
 - **category**: Term classification
 
 #### Categories:
 - **Core Concepts** (7 terms): class, function, method, module, package, object, type
 - **Built-in Types** (9 terms): int, str, list, dict, tuple, set, float, bool, complex
-- **Keywords/Constants** (8 terms): None, True, False, return, import, def, async, await
-- **Exceptions** (690 terms): All *Error and *Exception terms
-- **Code Elements** (825 terms): Terms in backticks, magic methods
-- **Common Terms** (1,365 terms): Frequently used technical terms
+- **Keywords/Constants** (25 terms): None, True, False, return, import, def, async, await, and other Python keywords
+- **Exceptions** (29 terms): Common *Error and *Exception classes
+- **Code Elements** (14 terms): Magic methods like __init__, __str__, etc.
+- **Common Terms** (34 terms): Important technical concepts like decorator, generator, iterator
 
 ## Maintenance
 
@@ -59,23 +59,21 @@ CSV files use UTF-8 encoding to properly handle Chinese characters. Compatible w
 
 ## Maintenance
 
-### Adding New Patterns
-To extend pattern recognition, modify `extract_key_terms()` function in `extract_terminology.py`:
-
-```python
-# Add new technical patterns
-tech_patterns = [
-    r'\b(?:new_pattern_here)\b',
-    # existing patterns...
-]
-```
-
-### Adjusting Filters
-Modify filtering criteria in `is_significant_term()` and `create_focused_dictionary()` functions.
-
-### Performance Optimization
-- Current processing: ~509 files in 2-3 minutes
-- Memory usage: ~50MB peak
-- Scalable to larger repositories
+### Adding New Terms
+New terms can be identified and added based on:
+- Frequency of appearance in documentation
+- Importance to Python concepts
+- Consistency needs across translation files
+
+### Manual Curation Process
+The dictionaries are maintained through careful analysis of:
+- Core Python terminology in official documentation
+- Existing translation patterns in .po files
+- Category-based organization for translator efficiency
+
+### Quality Assurance
+- Regular review of term translations for consistency
+- Cross-reference with official Python terminology
+- Validation against established translation conventions
 
 This documentation provides comprehensive guidance for maintaining and using the translation dictionary system to ensure consistent, high-quality Python documentation translation.
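
Note on using the regenerated files: the updated documentation describes `priority` and `category` columns in `focused_terminology_dictionary.csv` and states that the CSV files are UTF-8 encoded. A minimal sketch of grouping terms by category, assuming the term column is named `term` (the diff does not show the actual header), might look like the following. The sample rows and their priority values are hypothetical and stand in for the real file.

```python
import csv
import io
from collections import defaultdict


def terms_by_category(csv_file):
    """Group terms by the value of their `category` column.

    `csv_file` is any text-mode file object. The `term` column name
    is an assumption; adjust it to match the real CSV header.
    """
    grouped = defaultdict(list)
    for row in csv.DictReader(csv_file):
        grouped[row["category"]].append(row["term"])
    return dict(grouped)


# Hypothetical sample rows. For the real file, something like
#   open("focused_terminology_dictionary.csv", encoding="utf-8", newline="")
# would be passed in place of the StringIO object.
SAMPLE = """term,priority,category
class,High,Core Concepts
function,High,Core Concepts
list,High,Built-in Types
ValueError,Medium,Exceptions
__init__,Medium,Code Elements
"""

grouped = terms_by_category(io.StringIO(SAMPLE))
for category, terms in grouped.items():
    print(f"{category}: {', '.join(terms)}")
```

Reading through `csv.DictReader` keeps the sketch independent of column order, so it should keep working if further columns are added to the file later.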
