✅ Ultimate Data Manipulation Checklist (Language-Agnostic)
We'll group these manipulations into core categories, with examples relevant across Python and
other languages.
Practice each item and build your own personalized syntax reference for all 6 data types and all 5 data structures of R.
🧠 1. Type Awareness
✅ Identify the data type/class (type(x) in Python, typeof in JS, instanceof in Java, etc.)
✅ Check for None / null / undefined values
✅ Understand mutability (mutable vs. immutable types)
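The three checks above can be sketched in Python as follows. This is a minimal illustration; the variable names and the `is_missing` helper are hypothetical, not part of any library.

```python
# Inspect types, test for None, and observe mutability.
values = [42, 3.14, "text", None, [1, 2], (1, 2)]
type_names = [type(v).__name__ for v in values]

def is_missing(x):
    """Return True only for None (Python's null value)."""
    return x is None

# Lists are mutable: an alias sees in-place changes.
original = [1, 2]
alias = original
alias.append(3)

# Tuples are immutable: item assignment raises TypeError.
point = (1, 2)
try:
    point[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False
```

Note that `is_missing(0)` is False: zero and empty strings are falsy but not missing, which is why `x is None` is usually safer than `not x`.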
🔁 2. Type Conversion (Casting)
✅ Convert between int, float, string, bool
✅ Convert between collections: list ⬌ tuple ⬌ set ⬌ dict
✅ Serialize/deserialize (e.g., JSON, binary, XML)
✅ Convert to/from user-defined types (classes/objects)
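A short Python sketch of these conversions, using only the standard library. The `User` dataclass is a hypothetical user-defined type chosen for illustration.

```python
import json
from dataclasses import dataclass, asdict

# Scalar casts
as_int = int("3")
as_float = float("2.5")
as_str = str(3)
as_bool = bool("")            # empty string is falsy

# Collection casts
pairs = [("a", 1), ("b", 2)]
as_dict = dict(pairs)         # list of pairs to dict
as_tuple = tuple([1, 2, 2])
as_set = set([1, 2, 2])       # duplicates collapse

# JSON round trip (serialize, then deserialize)
payload = json.dumps({"name": "Ada", "tags": ["x", "y"]})
restored = json.loads(payload)

@dataclass
class User:                   # hypothetical user-defined type
    name: str
    age: int

user = User(**{"name": "Ada", "age": 36})   # dict to object
user_dict = asdict(user)                    # object back to dict
```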
✍️ 3. Value Manipulation
✅ Replace or update values
✅ Insert or append new values
✅ Delete or remove values (by value, key, or index)
✅ Rename columns, rows, or variables of a dataset
✅ Merge or extend values (e.g., concatenation, set union)
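As a rough Python illustration of the operations in this group, on a plain list and dict (the sample data is made up):

```python
scores = [10, 20, 30]
scores[0] = 15            # replace by index
scores.append(40)         # append to the end
scores.insert(1, 17)      # insert at position 1
scores.remove(30)         # delete by value
del scores[0]             # delete by index

record = {"nm": "Ada", "age": 36}
record["name"] = record.pop("nm")   # rename a key
record.update({"age": 37})          # update an existing value

merged_list = scores + [50]         # concatenation builds a new list
merged_set = {1, 2} | {2, 3}        # set union
```

For tabular data, renaming columns is normally done with a library call such as `df.rename()` in Pandas rather than by popping keys.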
📐 4. Structural Analysis
✅ Get length, size, or count of items
✅ Access by index/key
✅ Find first/last/occurrence of values
✅ Check for existence (in, hasKey, etc.)
✅ Loop/iterate through values
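A minimal Python sketch of these structural checks. The `letters` and `inventory` samples are hypothetical.

```python
letters = ["a", "b", "c", "b"]
inventory = {"apples": 3, "pears": 0}

count = len(letters)                                   # number of items
first_b = letters.index("b")                           # first occurrence
last_b = len(letters) - 1 - letters[::-1].index("b")   # last occurrence
occurrences = letters.count("b")

has_c = "c" in letters                 # membership in a list
has_key = "pears" in inventory         # key existence in a dict
plums = inventory.get("plums", 0)      # safe access with a default

# Iterate with positions, and over key/value pairs.
indexed = [(i, ch) for i, ch in enumerate(letters)]
items = [f"{k}={v}" for k, v in inventory.items()]
```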
🔤 5. String/Text Operations
✅ Change case (upper, lower, title, capitalize)
✅ Split and join strings
✅ Trim/strip whitespace or characters
✅ Replace substrings
✅ Search (startsWith, endsWith, contains, regex)
✅ Count characters or words
✅ Format strings (f-strings, templates)
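The string operations above, sketched in Python on a made-up input string:

```python
import re

raw = "  hello, World  "
trimmed = raw.strip()                         # remove surrounding whitespace
upper = trimmed.upper()
title = trimmed.title()
parts = trimmed.split(", ")                   # split on a separator
joined = "-".join(parts)                      # join with a separator
replaced = trimmed.replace("World", "there")

starts = trimmed.startswith("hello")
contains = "World" in trimmed
has_digit = re.search(r"\d", trimmed) is not None   # regex search

word_count = len(trimmed.split())             # whitespace-separated words
l_count = trimmed.count("l")                  # occurrences of one character

name = "Ada"
formatted = f"{name} scored {91.256:.1f}"     # f-string with a format spec
```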
🔢 6. Arithmetic and Logical Manipulation
✅ Basic math operations (+, -, *, /, **)
✅ Aggregates: sum, min, max, avg
✅ Boolean logic: AND, OR, NOT
✅ Comparisons: <, >, ==, !=
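For reference, a compact Python sketch of these operators. Note that Python has no built-in `avg`; `statistics.mean` from the standard library is used here instead.

```python
from statistics import mean

a, b = 7, 2
results = {
    "sum": a + b,
    "diff": a - b,
    "product": a * b,
    "true_div": a / b,     # always a float in Python 3
    "floor_div": a // b,   # integer division
    "power": a ** b,
}

data = [4, 8, 15]
total, smallest, largest, average = sum(data), min(data), max(data), mean(data)

strictly_greater = a > b and not (a == b)   # AND / NOT
either = a < b or a != b                    # OR
```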
📚 7. Indexing and Slicing
✅ Access by index (positive and negative)
✅ Slice ranges (start, stop, step)
✅ Nested indexing (e.g., 2D arrays, dict of dicts)
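A brief Python sketch of indexing and slicing, including nested access. The `config` dict and its keys are hypothetical.

```python
seq = [10, 20, 30, 40, 50]

first, last = seq[0], seq[-1]    # positive and negative index
middle = seq[1:4]                # start inclusive, stop exclusive
every_other = seq[::2]           # step of 2
backwards = seq[::-1]            # negative step reverses

grid = [[1, 2, 3], [4, 5, 6]]    # 2D list: row, then column
cell = grid[1][2]

config = {"db": {"host": "localhost", "port": 5432}}
port = config["db"]["port"]                        # dict of dicts
timeout = config.get("db", {}).get("timeout", 30)  # default if a level is absent
```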
✂️ 8. Insertion, Deletion, and Slicing (Collections)
✅ Insert at position
✅ Remove by value/index/key
✅ Pop elements
✅ Clear the whole collection
✅ Subsets (slice, take, drop, filter)
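One way to sketch these collection operations in Python, with emphasis on `pop`, `clear`, and take/drop style subsets. The sample names are illustrative only.

```python
from itertools import islice

queue = ["a", "b", "d"]
queue.insert(2, "c")             # insert at position 2
last = queue.pop()               # pop from the end
first = queue.pop(0)             # pop by index
queue.remove("c")                # remove by value

cache = {"x": 1, "y": 2}
removed = cache.pop("x")         # remove by key, returning the value
absent = cache.pop("zzz", None)  # default avoids KeyError
cache.clear()                    # empty the whole dict

numbers = list(range(10))
take3 = list(islice(numbers, 3))             # take the first 3
drop3 = numbers[3:]                          # drop the first 3
evens = [n for n in numbers if n % 2 == 0]   # filter by condition
```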
🔎 9. Search and Filter
✅ Find values matching conditions
✅ Check existence (contains, in)
✅ Filter by condition (e.g., filter() in Python)
✅ Index/position of items
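Searching and filtering a list of records might look like this in Python. The `people` data is invented for the example.

```python
people = [
    {"name": "Ada", "age": 36},
    {"name": "Linus", "age": 17},
    {"name": "Grace", "age": 45},
]

# Filter with a comprehension, or with the built-in filter().
adult_names = [p["name"] for p in people if p["age"] >= 18]
adults = list(filter(lambda p: p["age"] >= 18, people))

# First match, or None when nothing matches.
first_minor = next((p for p in people if p["age"] < 18), None)

# Existence check without building a list.
any_over_40 = any(p["age"] > 40 for p in people)

# Positions of all matching items.
minor_positions = [i for i, p in enumerate(people) if p["age"] < 18]
```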
📈 10. Ordering and Arrangement
✅ Sort ascending/descending
✅ Reverse order
✅ Shuffle/randomize
✅ Custom sort (e.g., by length or nested value)
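A small Python sketch of the ordering operations. A seeded `random.Random` instance is used so the shuffle is repeatable; the seed value is arbitrary.

```python
import random

words = ["pear", "fig", "banana"]

ascending = sorted(words)
descending = sorted(words, reverse=True)
by_length = sorted(words, key=len)         # custom sort key
reversed_words = list(reversed(words))     # reverse without sorting

rows = [
    {"id": 1, "meta": {"score": 0.4}},
    {"id": 2, "meta": {"score": 0.9}},
]
# Sort by a nested value, highest first.
by_score = sorted(rows, key=lambda r: r["meta"]["score"], reverse=True)

rng = random.Random(0)     # seeded generator for repeatable shuffles
shuffled = words[:]        # copy first, since shuffle works in place
rng.shuffle(shuffled)
```

`sorted()` returns a new list, while `list.sort()` and `shuffle()` modify the list in place.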
🔗 11. Joining and Splitting Collections
✅ Concatenate/merge lists, sets, dicts
✅ Join strings from lists
✅ Zip/unzip lists
✅ Chunk/split large collections
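These joining and splitting operations can be sketched in Python as below. The `chunk` helper is hypothetical, written here for illustration.

```python
keys = ["a", "b", "c"]
vals = [1, 2, 3]

combined = keys + ["d"]                      # concatenate lists
merged_dict = {**{"a": 1}, **{"b": 2}}       # merge dicts (later keys win)
sentence = " ".join(["data", "is", "fun"])   # join strings from a list

zipped = list(zip(keys, vals))               # pair items positionally
unz_keys, unz_vals = zip(*zipped)            # unzip back into tuples

def chunk(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

chunks = chunk(list(range(7)), 3)
```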
🔄 12. Mapping and Transformation
✅ Apply function to each item (map)
✅ List/set/dict comprehensions (Python)
✅ Transform key/value pairs in dicts
✅ Encode/decode values (e.g., base64, URL, HTML)
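A sketch of mapping and encoding in Python using only the standard library. The 10% markup in `with_tax` is an arbitrary figure chosen for the example.

```python
import base64
import html
from urllib.parse import quote, unquote

nums = [1, 2, 3]
squares_map = list(map(lambda n: n * n, nums))   # map()
squares_comp = [n * n for n in nums]             # list comprehension
lengths = {len(w) for w in ["a", "bb", "cc"]}    # set comprehension

prices = {"tea": 2.0, "coffee": 3.0}
# Dict comprehension transforming values (assumed 10% markup).
with_tax = {item: round(p * 1.1, 2) for item, p in prices.items()}

encoded = base64.b64encode(b"hi").decode("ascii")   # bytes to base64 text
decoded = base64.b64decode(encoded)                 # and back to bytes
url_safe = quote("a b&c")                           # URL percent-encoding
url_back = unquote(url_safe)
escaped = html.escape("<b>")                        # HTML entity escaping
```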
🧩 13. Set Operations
✅ Union
✅ Intersection
✅ Difference
✅ Symmetric difference
✅ Create subsets of a dataset (with or without conditions); check subset/superset relationships
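Python's built-in `set` type covers each of these operations directly. A minimal sketch with made-up values:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b            # in either set
intersection = a & b     # in both sets
difference = a - b       # in a but not in b
sym_diff = a ^ b         # in exactly one of the two

small = {n for n in a if n > 2}   # conditional subset
is_subset = small <= a            # subset check
is_superset = a >= small          # superset check
disjoint = a.isdisjoint({9, 10})  # no shared elements
```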
🧰 14. Advanced Utilities
✅ Deep copy vs shallow copy
✅ Flatten nested structures
✅ Grouping and aggregation
✅ Recursion/iteration for nested data
✅ Handling missing/null/undefined
✅ Data validation and cleaning
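Several of these utilities can be sketched together in Python. The `flatten` helper is hypothetical, and treating a missing quantity as 0 is an assumption made for this example, not a general rule.

```python
import copy
from collections import defaultdict

nested = {"tags": ["x"]}
shallow = copy.copy(nested)        # shares the inner list
deep = copy.deepcopy(nested)       # fully independent copy
nested["tags"].append("y")         # visible through shallow, not deep

def flatten(items):
    """Recursively flatten arbitrarily nested lists."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat

flat = flatten([1, [2, [3, 4]], 5])

# Group by category and aggregate, skipping missing values.
rows = [("fruit", 3), ("veg", 5), ("fruit", 4), ("veg", None)]
totals = defaultdict(int)
for category, qty in rows:
    totals[category] += qty if qty is not None else 0   # assumption: None counts as 0
```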
📦 15. Object and Class Manipulation (if OOP)
✅ Access properties and methods
✅ Add/remove attributes
✅ Use introspection/reflection
✅ Inheritance/polymorphism considerations
✅ Serialization/deserialization of objects
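A minimal Python sketch of these object-level operations. The `Animal` and `Dog` classes are hypothetical, and serializing through `vars()` only works for simple objects whose attributes are JSON-compatible.

```python
import json

class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return f"{self.name} makes a sound"

    def to_json(self):
        # vars() exposes the instance attribute dict.
        return json.dumps(vars(self))

class Dog(Animal):                 # inheritance
    def speak(self):               # polymorphic override
        return f"{self.name} barks"

pet = Dog("Rex")
pet.age = 3                        # add an attribute at runtime
had_age = hasattr(pet, "age")      # introspection
del pet.age                        # remove the attribute
name_via_getattr = getattr(pet, "name")
is_animal = isinstance(pet, Animal)

sounds = [a.speak() for a in (Animal("Cat"), pet)]
serialized = pet.to_json()
restored = Dog(**json.loads(serialized))
```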
🧠 Bonus: Meta-Manipulation
✅ Measure memory/space usage
✅ Track performance/timing
✅ Immutable vs mutable manipulation strategies
✅ Thread-safe or async-safe manipulation
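The meta-level checks can be sketched in Python with the standard library. This is only an illustration: `sys.getsizeof` reports the shallow size of an object, and the thread and iteration counts below are arbitrary.

```python
import sys
import threading
import time

data = list(range(1000))
size_bytes = sys.getsizeof(data)   # shallow size of the list object only

start = time.perf_counter()
total = sum(data)
elapsed = time.perf_counter() - start   # wall-clock seconds

# Immutable strategy: build a new value instead of changing the old one.
frozen = tuple(data[:3])
extended = frozen + (3,)

# Thread-safe mutation: guard the shared state with a lock.
state = {"count": 0}
lock = threading.Lock()

def add_many(n):
    for _ in range(n):
        with lock:
            state["count"] += 1

threads = [threading.Thread(target=add_many, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```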
Mapped to General Concepts + Python + R + SQL
# | 📌 Concept | 🔄 General Description | 🐍 Python | 📊 R | SQL
--- | --- | --- | --- | --- | ---
1 | Type Checking | Know the type of a value | type(x) | class(x), typeof(x) | SELECT typeof(column) (SQLite)
2 | Type Conversion | Convert between types | int("3"), str(3) | as.numeric("3") | CAST(column AS INT)
3 | Value Assignment / Mutation | Change or set value | lst[0] = 5 | vec[1] <- 5 | UPDATE table SET col = 5
4 | Count / Length | Get number of items | len(x) | length(x) / nchar() | COUNT(*)
5 | Text Case Manipulation | Capitalize, lowercase, titlecase | "abc".upper() | toupper("abc") | UPPER(column)
6 | Indexing / Position Access | Access element by position | s[0], lst[2] | vec[1], list[[1]] | N/A (can use LIMIT, OFFSET)
7 | Arithmetic / Logic | Perform calculations, comparisons | a + b, a > b | a + b, a > b | SELECT a + b
8 | Slicing / Subsets | Get sub-parts of data | lst[1:3], s[1:3] | vec[1:3], subset() | SELECT * WHERE, LIMIT, SUBSTRING()
9 | Pattern Matching / Search | Find by pattern or rule | "abc".startswith("a") | grepl("^a", str) | LIKE 'a%', REGEXP
10 | Counting Occurrences | Frequency of values | lst.count(3) | sum(vec == 3) | SELECT COUNT(*) WHERE ...
11 | Appending / Adding Values | Add item(s) to a collection | lst.append(4) | c(vec, 4) | INSERT INTO table VALUES (...)
12 | Sorting | Order data | sorted(lst) | sort(vec) | ORDER BY
13 | Reversing | Reverse item order | lst[::-1], reverse() | rev(vec) | ORDER BY col DESC
14 | Collection Conversion | Convert between list, set, tuple, dict | list(), tuple(), set() | as.list(), unlist() | N/A (done via table structures)
15 | Object/Field Access | Access properties of an object | obj.attr | obj$field / S3/S4 slots | Table.column
16 | Filtering / Conditional Selection | Filter rows/values by condition | [x for x in lst if x > 5] | vec[vec > 5] | SELECT * FROM table WHERE col > 5
17 | Aggregation / Summarization | Sum, average, min, max | sum(lst), max(lst) | sum(vec), mean(vec) | SUM(col), AVG(col)
18 | Joining / Merging | Combine multiple data sources | pd.merge(df1, df2) | merge(df1, df2) | JOIN
19 | Grouping | Group by a category and summarize | groupby() (Pandas) | group_by() + summarize() | GROUP BY
20 | Missing Data Handling | Handle None, NA, NULL | is None, if x is not None: | is.na(x) | IS NULL, COALESCE()
21 | Removing Items / Values | Delete values | del lst[0], remove(x) | vec[-1], subset() | DELETE FROM table WHERE ...
22 | Deduplication | Remove duplicates | set(lst), dict.fromkeys() | unique(vec) | SELECT DISTINCT
23 | Renaming Items | Rename columns, variables | df.rename() (Pandas) | rename() (dplyr) | AS keyword
24 | Reshaping | Pivot, melt, transpose | pivot(), melt() (Pandas) | pivot_longer(), spread() | PIVOT, UNPIVOT (SQL Server, Oracle)
25 | Combining / Concatenation | Combine multiple elements or datasets | " ".join(lst), + | paste(), bind_rows() | UNION, CONCAT()
26 | Splitting | Break into parts | "a,b".split(",") | strsplit() | SPLIT_PART() (PostgreSQL)
27 | Iteration | Loop through elements | for x in lst: | for (x in vec) / lapply() | Cursors or iterative SQL procedures
28 | Function Application | Apply function to items | map(func, lst) | lapply(), sapply() | Not native; needs subquery or UDF
29 | Nested Structures Handling | Access deep elements | dict['key']['sub'] | list[[1]][[2]] | Not typical in SQL
30 | Data Structure Creation | Define data structures | [], {}, () | c(), list(), data.frame() | CREATE TABLE, INSERT