Commit 5ef2c67

fengmk2, claude, and gemini-code-assist[bot] authored
chore(benchmark): add CPU profiler analysis tools (#917)
Add scripts and reports for analyzing V8/xprofiler CPU profiles:

- analyze-profile.js: Comprehensive CPU profile analyzer
- hotspot-finder.js: Find specific hotspots with filtering
- call-tree-analyzer.js: Analyze call relationships between layers
- flamegraph-convert.js: Convert to folded stack format for flame graphs
- REPORT.md: Analysis findings showing Leoric Bone constructor as main hotspot
- CALL-DIAGRAM.md: Visual call relationship diagram

Key findings: Leoric ORM Bone constructor consumes 15.38% of active CPU time, while application code only uses 2.18%.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Signed-off-by: MK (fengmk2) <[email protected]>
Co-authored-by: Claude <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
1 parent 37d043c commit 5ef2c67

7 files changed: +1331 -0 lines changed
benchmark/profiler/CALL-DIAGRAM.md

Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
# Application → Leoric Bone Call Relationship Diagram

## Summary

The Leoric `Bone` constructor (15.38% of active CPU time) is reached through these application-layer entry points:

| Rank | Application Entry Point | Hits | File |
|------|------------------------|------|------|
| 1 | `convertEntityToModel` | 141 | ModelConvertor.js:8 |
| 2 | `saveEntityToModel` | 24 | ModelConvertor.js:50 |
| 3 | `syncPackage` | 9 | PackageSearchService.js:16 |
| 4 | `(anonymous)` in findAllVersions | 8 | PackageVersionRepository.js:57 |
| 5 | `convertModelToEntity` | 7 | ModelConvertor.js:74 |
| 6 | `findBinary` | 4 | BinaryRepository.js:27 |

**Note**: 1,743 hits (most of the Bone calls) come from paths without direct application code; they are triggered by async operations and internal leoric queries.
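
One way per-entry-point counts like these could be derived is to walk up from every sampled `Bone` frame to the nearest ancestor that lives in application code. The sketch below is only an illustration, not the logic of `call-tree-analyzer.js`: it assumes a Chrome-style `.cpuprofile` with a flat `nodes` array, and the `isAppFrame` rule (a `/app/` path outside `node_modules`) is a hypothetical choice.

```typescript
interface CallFrame { functionName: string; url: string; lineNumber: number }
interface ProfileNode { id: number; callFrame: CallFrame; hitCount?: number; children?: number[] }

// Hypothetical rule: anything under /app/ and outside node_modules is application code.
const isAppFrame = (f: CallFrame): boolean =>
  f.url.includes("/app/") && !f.url.includes("node_modules");

// Attribute the self hits of every node named `target` to its nearest application ancestor.
function hitsByEntryPoint(nodes: ProfileNode[], target: string): Map<string, number> {
  const byId = new Map<number, ProfileNode>();
  const parent = new Map<number, number>();
  for (const n of nodes) {
    byId.set(n.id, n);
    for (const c of n.children ?? []) parent.set(c, n.id);
  }
  const result = new Map<string, number>();
  for (const n of nodes) {
    if (n.callFrame.functionName !== target || !n.hitCount) continue;
    let key = "(no application frame)";
    let cur = parent.get(n.id);
    while (cur !== undefined) {
      const frame = byId.get(cur)!.callFrame;
      if (isAppFrame(frame)) {
        key = frame.functionName || "(anonymous)";
        break;
      }
      cur = parent.get(cur);
    }
    result.set(key, (result.get(key) ?? 0) + n.hitCount);
  }
  return result;
}
```

Hits that never pass through an application frame end up under `(no application frame)`, which corresponds to the kind of bucket the note above describes.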

## Call Flow Diagram

```mermaid
flowchart TD
    subgraph App["Application Layer (2.18% CPU)"]
        A1["convertEntityToModel<br/>ModelConvertor.js:8<br/>141 hits"]
        A2["saveEntityToModel<br/>ModelConvertor.js:50<br/>24 hits"]
        A3["syncPackage<br/>PackageSearchService.js<br/>9 hits"]
        A4["convertModelToEntity<br/>ModelConvertor.js:74<br/>7 hits"]
        A5["findBinary<br/>BinaryRepository.js<br/>4 hits"]
    end

    subgraph Repo["Repository Layer"]
        R1["BinaryRepository.saveBinary"]
        R2["TaskRepository.saveTask"]
        R3["PackageVersionRepository.findAllVersions"]
        R4["BinaryRepository.listBinaries"]
    end

    subgraph Leoric["Leoric ORM (24% CPU)"]
        L1["Bone.create()"]
        L2["Bone.save()"]
        L3["Bone.findOne()"]
        L4["Bone.find()"]
        L5["instantiate()"]
        L6["dispatch()"]
        BONE["🔥 Bone Constructor<br/>bone.js:150<br/>1553 hits (15.38%)"]
    end

    A1 --> R1
    R1 --> L1
    L1 --> BONE

    A2 --> R2
    R2 --> L2
    L2 --> BONE

    A3 --> L3
    L3 --> L5
    L5 --> L6
    L6 --> BONE

    A4 --> R4
    R4 --> L4
    L4 --> L5

    A5 --> L3

    style BONE fill:#ff6b6b,stroke:#c92a2a,stroke-width:3px,color:#fff
    style App fill:#d3f9d8,stroke:#2b8a3e
    style Leoric fill:#ffe3e3,stroke:#c92a2a
```

## Detailed Call Paths

### Path 1: Entity Creation (141 hits - Highest Among Application Entry Points)

```
BinarySyncerService.saveBinaryItem()
└── BinaryRepository.saveBinary()
    └── ModelConvertor.convertEntityToModel()
        └── Bone.create()
            └── ContextModelClass()
                └── 🔥 Bone() constructor [132 hits]
```

**This is the hottest path that starts in application code.** Every time a new entity is saved to the database, the `Bone` constructor is called.

### Path 2: Entity Update (24 hits)

```
ChangesStreamService / TaskService
└── TaskRepository.saveTask()
    └── ModelConvertor.saveEntityToModel()
        └── Bone.save()
            └── Bone._save()
                └── Bone.update()
                    └── Bone._update() [7 hits]
                        └── Bone.changes()
                            └── deep-equal checks [expensive]
```

### Path 3: Database Query Results (1553+ hits - Main Hotspot)

```
Any Repository.find*() method
└── Leoric Spell.then()
    └── ignite()
        └── dispatch()
            └── instantiate()
                └── ContextModelClass()
                    └── 🔥 Bone() constructor [1553 hits]
```

**This is where most CPU time is spent.** Every row returned from the database triggers a `Bone` constructor call to instantiate the ORM model.

## Root Cause Analysis

The `Bone` constructor is expensive because it:

1. **Initializes all attribute accessors** - Creates getters/setters for each column
2. **Sets up change tracking** - Prepares for dirty checking
3. **Validates attributes** - Runs type checking on initialization
4. **Uses deep-equal** - Expensive comparison for change detection

## Optimization Recommendations

### 1. Batch Operations
When inserting many records, use bulk insert instead of individual creates:
```typescript
// Instead of:
for (const entity of entities) {
  await Model.create(entity); // Each calls Bone constructor
}

// Use:
await Model.bulkCreate(entities); // Single operation
```
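
If a single bulk insert would be too large for one statement, the entities can be split into fixed-size batches first. This is a minimal sketch under stated assumptions: `chunk` and `insertInBatches` are hypothetical helpers, the `BulkModel` interface merely stands in for a model class that exposes `bulkCreate`, and the default batch size of 500 is arbitrary.

```typescript
// Stand-in for a model class that exposes bulkCreate (assumed shape, for illustration only).
interface BulkModel<T> {
  bulkCreate(records: T[]): Promise<unknown>;
}

// Split an array into consecutive slices of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Issue one bulkCreate call per batch and report how many calls were made.
async function insertInBatches<T>(model: BulkModel<T>, entities: T[], size = 500): Promise<number> {
  let batches = 0;
  for (const batch of chunk(entities, size)) {
    await model.bulkCreate(batch);
    batches++;
  }
  return batches;
}
```

Whether batching actually reduces `Bone` constructor hits depends on how the ORM implements bulk inserts, so it is worth re-profiling after such a change.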

### 2. Raw Queries for Read-Heavy Operations
For read operations that don't need full ORM features:
```typescript
// Instead of:
const records = await Model.find({ where: {...} }); // Creates Bone instances

// Consider:
const records = await Model.query('SELECT * FROM ...', { raw: true }); // Plain objects
```

### 3. Select Only Needed Columns
```typescript
// Instead of:
await Model.findOne({ where: {...} }); // Loads all columns

// Use:
await Model.findOne({ where: {...}, select: ['id', 'name'] }); // Fewer attributes to initialize
```

### 4. Consider Leoric Configuration
Check if leoric has options to:
- Disable change tracking for read-only queries
- Use lazy attribute initialization
- Skip validation on trusted data

## Files to Review

1. `app/repository/util/ModelConvertor.ts` - Main entity/model conversion
2. `app/repository/BinaryRepository.ts` - Heavy on creates
3. `app/core/service/BinarySyncerService.ts` - Triggers many entity creations
4. Any repository with high query volume

benchmark/profiler/REPORT.md

Lines changed: 182 additions & 0 deletions
@@ -0,0 +1,182 @@
# CPU Profile Analysis Report

- **Profile**: `x-cpuprofile-3000679-20251207-0.cpuprofile`
- **Date**: 2025-12-07
- **Duration**: 180.02 seconds (3 minutes)
- **Profile Type**: `xprofiler-cpu-profile`

## Executive Summary

This CPU profile was captured from a cnpmcore production instance. The profile shows the application is mostly idle (90% of the time), with about 6% active CPU usage during the 3-minute sampling period.

### Key Findings

1. **Leoric ORM is the #1 CPU consumer** - 24% of active CPU time is spent in leoric (ORM library)
2. **The `Bone` constructor is the main hotspot** - Taking 15.38% of active CPU time alone
3. **deep-equal operations in leoric are expensive** - Type-checking helpers (`is-string`, `is-number-object`, `is-array-buffer`) account for about 2.5% of active CPU time
4. **Application code is very efficient** - Only 2.18% of active CPU time is in application code

## CPU Time Distribution

| Category | Samples | % of Total |
|----------|---------|------------|
| Idle | 151,070 | 90.09% |
| GC (Garbage Collection) | 4,888 | 2.91% |
| Active/User Code | 10,098 | 6.02% |
| Program | 1,641 | 0.98% |
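
The percentages in this report use two different denominators: all samples (this table) and active samples only (the hotspot figures). The short sketch below recomputes both from the sample counts in the table and the 1,553 `Bone` constructor hits reported in this commit. The `percent` helper is hypothetical and only illustrates the arithmetic.

```typescript
// Sample counts copied from the CPU Time Distribution table.
const samples = { idle: 151070, gc: 4888, active: 10098, program: 1641 };

// Percentage of `part` in `whole`, rounded to two decimals.
function percent(part: number, whole: number): number {
  return Math.round((part / whole) * 10000) / 100;
}

const total = samples.idle + samples.gc + samples.active + samples.program; // 167697

// Bone constructor hits measured against active samples only.
const boneOfActive = percent(1553, samples.active);

// The same hits measured against every sample, including idle time.
const boneOfTotal = percent(1553, total);

console.log({ total, boneOfActive, boneOfTotal });
```

Measured against active samples the constructor accounts for 15.38%; measured against all samples, it is under 1% because the process was idle for most of the capture.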

## Active CPU Time Breakdown

| Category | Samples | % of Active |
|----------|---------|-------------|
| NPM Packages | 4,085 | 40.45% |
| Native/V8 | 2,975 | 29.46% |
| Node.js Core | 2,818 | 27.91% |
| Application Code | 220 | 2.18% |
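
A breakdown like this can be produced by bucketing each sampled frame by its script URL. The rules below are assumptions made for illustration (empty URL means native, a `node:` prefix means Node.js core, a `node_modules` segment means an npm package); `analyze-profile.js` may classify frames differently, and idle, GC, and program samples are assumed to have been filtered out beforehand.

```typescript
type Category = "Application Code" | "NPM Packages" | "Node.js Core" | "Native/V8";

// Hypothetical classification rules based only on the call frame URL.
function classify(url: string): Category {
  if (url === "") return "Native/V8"; // frames without a script URL
  if (url.startsWith("node:")) return "Node.js Core"; // built-in modules
  if (url.includes("node_modules")) return "NPM Packages"; // third-party dependencies
  return "Application Code";
}

// Sum self hits per category.
function countByCategory(frames: { url: string; hitCount: number }[]): Record<Category, number> {
  const totals: Record<Category, number> = {
    "Application Code": 0,
    "NPM Packages": 0,
    "Node.js Core": 0,
    "Native/V8": 0,
  };
  for (const f of frames) {
    totals[classify(f.url)] += f.hitCount;
  }
  return totals;
}
```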

## Top Performance Bottlenecks

### 1. Leoric ORM - `Bone` Constructor (15.38%)

The single biggest CPU consumer is the `Bone` constructor in leoric ORM.

**Location**: `node_modules/[email protected]@leoric/lib/bone.js:150`

**Call paths**:
- Database query results → `instantiate()` → `dispatch()` → `Bone()`
- Entity creation → `create()` → `Bone()`

**Recommendation**:
- Consider lazy instantiation for bulk queries
- Review if all Bone properties need to be initialized upfront
- Consider upgrading leoric if newer versions have optimizations

### 2. Deep Equality Checks in Leoric (2.5%)

The `changes()` function in leoric uses `deep-equal` which triggers expensive type checking:

| Function | Samples | % |
|----------|---------|---|
| tryStringObject (is-string) | 68 | 0.67% |
| isArrayBuffer | 51 | 0.50% |
| tryNumberObject | 45 | 0.45% |
| booleanBrandCheck | 51 | 0.50% |
| isSharedArrayBuffer | 37 | 0.37% |

**Recommendation**:
- Check if leoric has an option to skip change detection
- For bulk inserts, consider using raw SQL queries
- Review if `deep-equal` can be replaced with faster comparison
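
As an illustration of what a faster comparison could look like, the sketch below short-circuits on primitives and dates and only falls back to a structural comparison for other objects. `fastEqual` and `changedKeys` are hypothetical helpers, not part of leoric or `deep-equal`, and the JSON-based fallback is a simplification that assumes plain JSON-like column values with a stable key order.

```typescript
// Compare two attribute values, taking the cheap paths first.
function fastEqual(a: unknown, b: unknown): boolean {
  if (Object.is(a, b)) return true; // identical primitives or the same reference
  if (a instanceof Date && b instanceof Date) return a.getTime() === b.getTime();
  if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
    return false; // differing primitives, or a primitive compared with an object
  }
  // Assumption: remaining values are JSON-like (for example JSON columns).
  return JSON.stringify(a) === JSON.stringify(b);
}

// List the attribute names whose values differ between two snapshots.
function changedKeys(prev: Record<string, unknown>, next: Record<string, unknown>): string[] {
  const keys = Object.keys(prev);
  for (const k of Object.keys(next)) {
    if (!(k in prev)) keys.push(k);
  }
  return keys.filter((k) => !fastEqual(prev[k], next[k]));
}
```

For rows made up mostly of strings and numbers, nearly every comparison ends at the first `Object.is` check. Values such as `Buffer`, `Map`, or class instances would need dedicated handling that this sketch does not provide.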

### 3. MySQL2 Driver (2.72%)

MySQL2 operations including result parsing:

| Function | Samples | % |
|----------|---------|---|
| column_definition.get | 56 | 0.55% |
| query.start | 49 | 0.49% |
| keyFromFields | 30 | 0.30% |

**Recommendation**:
- These are normal database operations - no immediate action needed
- Consider connection pooling optimization if not already configured

### 4. Network I/O (writev/writeBuffer) - 14.6%

Significant time is spent in network I/O operations:

| Function | Samples | % |
|----------|---------|---|
| writev (native) | 1,037 | 10.27% |
| writeBuffer (native) | 437 | 4.33% |

**Recommendation**:
- This is expected for a registry that serves packages
- Consider response compression if not enabled
- Review if large payloads can be streamed

### 5. urllib JSON Parsing (0.31%)

**Location**: `node_modules/[email protected]@urllib/dist/esm/utils.js:25`

**Recommendation**:
- Normal operation for HTTP client responses
- Consider if some responses don't need JSON parsing

## Application Code Analysis

The application code is highly efficient. Top application hotspots:

| Function | File | Samples | % |
|----------|------|---------|---|
| convertModelToEntity | ModelConvertor.js:74 | 38 | 0.38% |
| syncPackage | PackageSearchService.js:16 | 22 | 0.22% |
| syncPackageWithPackument | PackageSyncerService.js:926 | 14 | 0.14% |
| findBinary | BinaryRepository.js:27 | 7 | 0.07% |

**Observation**: The application code is well-optimized. Most CPU time is in third-party dependencies.

## Recommendations Summary

### High Priority

1. **Investigate Leoric Bone Constructor**
   - This is consuming 15.38% of active CPU time
   - Check if leoric has batch instantiation options
   - Consider lazy loading of entity properties
   - Profile specific queries to identify the most expensive ones

2. **Review deep-equal Usage**
   - The `changes()` function triggers expensive type checks
   - For bulk operations, consider skipping change detection
   - Explore if leoric supports simpler comparison strategies

### Medium Priority

3. **GC Optimization**
   - GC is at 2.91% of total samples, which is reasonable but could be improved
   - Review object allocation patterns in hot paths
   - Consider object pooling for frequently created objects

4. **Network I/O Review**
   - writev operations are expected, but at about 10% of active CPU time they are worth monitoring
   - Ensure response streaming is properly configured
   - Review large payload handling

### Low Priority

5. **Keep Application Code Lean**
   - Application code is only 2.18% of active CPU time - excellent
   - Continue following current coding patterns

## Tools Created

The following analysis scripts have been created in `benchmark/profiler/`:

1. **analyze-profile.js** - Comprehensive CPU profile analyzer
   ```bash
   node benchmark/profiler/analyze-profile.js path/to/profile.cpuprofile
   ```

2. **hotspot-finder.js** - Find specific hotspots with filtering
   ```bash
   node benchmark/profiler/hotspot-finder.js profile.cpuprofile --filter=leoric --top=20
   ```

3. **flamegraph-convert.js** - Convert to folded stack format for flame graphs
   ```bash
   node benchmark/profiler/flamegraph-convert.js profile.cpuprofile > stacks.txt
   ```
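
For readers unfamiliar with the folded stack format, each output line is a semicolon-separated call stack followed by a sample count. The sketch below shows one way such lines could be generated from a profile. It is not the code in `flamegraph-convert.js`: it assumes a Chrome-style `.cpuprofile` with a flat `nodes` array, and `toFoldedStacks` is a hypothetical name.

```typescript
interface CpuNode {
  id: number;
  callFrame: { functionName: string };
  hitCount?: number;
  children?: number[];
}

// Emit one "root;child;leaf count" line per node that has self samples.
function toFoldedStacks(nodes: CpuNode[]): string[] {
  const byId = new Map<number, CpuNode>();
  const childIds = new Set<number>();
  for (const n of nodes) {
    byId.set(n.id, n);
    for (const c of n.children ?? []) childIds.add(c);
  }

  const lines: string[] = [];
  const walk = (node: CpuNode, stack: string[]): void => {
    const path = [...stack, node.callFrame.functionName || "(anonymous)"];
    if (node.hitCount) lines.push(`${path.join(";")} ${node.hitCount}`);
    for (const c of node.children ?? []) {
      const child = byId.get(c);
      if (child) walk(child, path);
    }
  };

  // Roots are nodes that never appear as another node's child.
  for (const n of nodes) {
    if (!childIds.has(n.id)) walk(n, []);
  }
  return lines;
}
```

Lines in this shape can be rendered with Brendan Gregg's `flamegraph.pl` or loaded into speedscope, which accepts the collapsed stack format.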

## Viewing the Profile

The `.cpuprofile` file can be viewed in:

1. **Chrome DevTools**: Open `chrome://inspect` → Open dedicated DevTools → Performance tab → Load
2. **speedscope.app**: Upload the file directly at https://www.speedscope.app/
3. **VS Code**: Install "vscode-js-profile-flame" extension

## Conclusion

The cnpmcore application is well-optimized, with only 2.18% of active CPU time spent in application code. The main optimization opportunity is in the leoric ORM layer, specifically the `Bone` constructor, which consumes 15.38% of active CPU time. The GC time at 2.91% is reasonable for a Node.js application of this complexity.