
Data Manipulation Topics List

Uploaded by

bistbhupe

✅ Ultimate Data Manipulation Checklist (Language-Agnostic)

We'll group these manipulations into core categories, with examples relevant across Python and
other languages.

Practice and build your own syntax reference for all 6 data types and all 5 data structures of R.

🧠 1. Type Awareness
✅ Identify the data type/class (type(x) in Python, typeof in JS, getClass() or instanceof in Java, etc.)

✅ Check for None / null / undefined values

✅ Understand mutability (mutable vs. immutable types)
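
The three checks above can be sketched in Python as follows. This is a minimal illustration; the sample values are made up for the example.

```python
# Hypothetical sample values, chosen only for illustration.
values = [42, 3.14, "text", None, [1, 2], (1, 2)]

# Identify the runtime type of each value.
type_names = [type(v).__name__ for v in values]

# Check for None with an identity test (is), not equality (==).
none_flags = [v is None for v in values]

# Lists are mutable: an in-place change is visible through every alias.
original = [1, 2]
alias = original
alias.append(3)

# Tuples are immutable: item assignment raises TypeError.
point = (1, 2)
try:
    point[0] = 99
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False
```

Because `alias` and `original` name the same list, appending through one changes both; copying first (`list(original)`) avoids that.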

🔁 2. Type Conversion (Casting)


✅ Convert between int, float, string, bool

✅ Convert between collections: list ⬌ tuple ⬌ set ⬌ dict

✅ Serialize/deserialize (e.g., JSON, binary, XML)

✅ Convert to/from user-defined types (classes/objects)
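
A short Python sketch of these conversions follows. The `Point` class is a hypothetical user-defined type introduced only for this example.

```python
import json
from dataclasses import dataclass, asdict

# Scalar conversions.
n = int("3")          # str -> int
f = float("2.5")      # str -> float
s = str(3)            # int -> str
b = bool("")          # empty string is falsy

# Collection conversions.
pairs = [("a", 1), ("b", 2)]
as_dict = dict(pairs)            # list of pairs -> dict
as_tuple = tuple([1, 2, 2])      # list -> tuple
as_set = set(as_tuple)           # tuple -> set (duplicates dropped)

# Serialize to JSON text and back.
encoded = json.dumps(as_dict)
decoded = json.loads(encoded)


# Hypothetical user-defined type, converted from and to a dict.
@dataclass
class Point:
    x: int
    y: int


p = Point(**{"x": 1, "y": 2})
p_dict = asdict(p)
```

One thing to watch: `bool("False")` is `True`, because any non-empty string is truthy.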

✍️ 3. Value Manipulation


✅ Replace or update values

✅ Insert or append new values

✅ Delete or remove values (by value, key, or index)

✅ Rename columns, rows, or variables of a dataset

✅ Merge or extend values (e.g., concatenation, set union)
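
As a rough illustration, the operations above look like this on a Python list and dict. The data is made up for the example.

```python
lst = [10, 20, 30]
lst[0] = 5           # replace by index   -> [5, 20, 30]
lst.append(40)       # append at the end  -> [5, 20, 30, 40]
lst.insert(1, 15)    # insert at position -> [5, 15, 20, 30, 40]
lst.remove(30)       # delete by value    -> [5, 15, 20, 40]
del lst[0]           # delete by index    -> [15, 20, 40]

record = {"nm": "Ada", "age": 36}
record["name"] = record.pop("nm")    # rename a key
record.update({"city": "London"})    # merge in another dict

merged = lst + [50]                  # concatenation builds a new list
union = {1, 2} | {2, 3}              # set union
```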

📐 4. Structural Analysis
✅ Get length, size, or count of items

✅ Access by index/key
✅ Find the first, last, or any occurrence of a value

✅ Check for existence (in, hasKey, etc.)

✅ Loop/iterate through values
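
A minimal Python sketch of these structural checks, using made-up sample data:

```python
s = "banana"
lst = [3, 1, 3]
d = {"a": 1}

length = len(s)              # number of characters
first = s.find("an")         # index of the first occurrence
last = s.rfind("an")         # index of the last occurrence
first_three = lst.index(3)   # position of the first matching item

in_string = "nan" in s       # substring existence
has_key = "a" in d           # key existence
missing = d.get("z")         # access by key without raising KeyError

# Iterate with positions attached.
indexed = [(i, x) for i, x in enumerate(lst)]
```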

🔤 5. String/Text Operations
✅ Change case (upper, lower, title, capitalize)

✅ Split and join strings

✅ Trim/strip whitespace or characters

✅ Replace substrings

✅ Search (startsWith, endsWith, contains, regex)

✅ Count characters or words

✅ Format strings (f-strings, templates)
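
The string operations above can be tried on one sample string. This is an illustrative sketch; the input text is arbitrary.

```python
import re

raw = "  hello, World  "

trimmed = raw.strip()                          # remove surrounding whitespace
upper = trimmed.upper()                        # change case
title = trimmed.title()
parts = trimmed.split(", ")                    # split on a separator
joined = "-".join(parts)                       # join back with another separator
replaced = trimmed.replace("World", "there")   # replace a substring

starts = trimmed.startswith("hello")           # prefix search
has_digit = re.search(r"\d", trimmed) is not None   # regex search

word_count = len(trimmed.split())              # whitespace-separated words
l_count = trimmed.count("l")                   # occurrences of one character

formatted = f"{parts[0]!r} has {len(parts[0])} characters"   # f-string
```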

🔢 6. Arithmetic and Logical Manipulation


✅ Basic math operations (+, -, *, /, **)

✅ Aggregates: sum, min, max, avg

✅ Boolean logic: AND, OR, NOT

✅ Comparisons: <, >, ==, !=
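
A brief Python sketch of these operations, with arbitrary numbers chosen for the example:

```python
a, b = 7, 2

quotient = a / b         # true division always returns a float
floor_div = a // b       # floor division
power = a ** b           # exponentiation
remainder = a % b        # modulo

nums = [4, 8, 15]
total = sum(nums)
smallest, largest = min(nums), max(nums)
average = total / len(nums)

both = a > 5 and b > 5   # AND
either = a > 5 or b > 5  # OR
negated = not both       # NOT
different = a != b       # comparison
```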

📚 7. Indexing and Slicing


✅ Access by index (positive and negative)

✅ Slice ranges (start, stop, step)

✅ Nested indexing (e.g., 2D arrays, dict of dicts)
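
In Python, indexing and slicing might look like the sketch below. The lists and dicts are made-up examples.

```python
lst = [0, 1, 2, 3, 4, 5]

first = lst[0]           # positive index
last = lst[-1]           # negative index counts from the end
middle = lst[1:4]        # slice: start inclusive, stop exclusive
every_other = lst[::2]   # slice with a step
backwards = lst[::-1]    # negative step reverses

grid = [[1, 2], [3, 4]]
cell = grid[1][0]        # row 1, column 0 of a 2D list

nested = {"user": {"name": "Ada"}}
name = nested["user"]["name"]                   # dict of dicts
email = nested.get("user", {}).get("email")     # safe access, None if absent
```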

✂️ 8. Insertion, Deletion, and Slicing (Collections)


✅ Insert at position

✅ Remove by value/index/key

✅ Pop elements

✅ Clear the whole collection

✅ Subsets (slice, take, drop, filter)
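
A possible Python sketch of these collection operations follows. The names `take` and `drop` are borrowed from the checklist wording and are implemented here with plain slices.

```python
stack = [1, 2, 3, 4, 5]

stack.insert(0, 0)       # insert at position 0 -> [0, 1, 2, 3, 4, 5]
last = stack.pop()       # pop from the end     -> returns 5
second = stack.pop(1)    # pop by index         -> returns 1

take = stack[:2]         # first two items
drop = stack[2:]         # everything after the first two
evens = [x for x in stack if x % 2 == 0]   # filtered subset

d = {"a": 1, "b": 2}
removed = d.pop("a")         # remove by key, returning the value
absent = d.pop("zz", None)   # a default avoids KeyError for missing keys

snapshot = list(stack)   # copy before clearing
stack.clear()            # empty the whole collection
```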

🔎 9. Search and Filter


✅ Find values matching conditions

✅ Check existence (contains, in)

✅ Filter by condition (e.g., filter() in Python)

✅ Index/position of items
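
These search and filter steps can be sketched in Python as below, using an arbitrary list of numbers.

```python
nums = [3, 8, 12, 5, 8]

# All values matching a condition.
matches = [x for x in nums if x > 5]

# The same idea with the built-in filter().
even = list(filter(lambda x: x % 2 == 0, nums))

# First match only, with None as the fallback when nothing matches.
first_big = next((x for x in nums if x > 10), None)
no_match = next((x for x in nums if x > 100), None)

contains = 5 in nums           # existence check
position = nums.index(8)       # index of the first occurrence
all_positions = [i for i, x in enumerate(nums) if x == 8]
```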

📈 10. Ordering and Arrangement


✅ Sort ascending/descending

✅ Reverse order

✅ Shuffle/randomize

✅ Custom sort (e.g., by length or nested value)
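
A small Python sketch of sorting, reversing, and shuffling follows. The word list and the `records` data are hypothetical.

```python
import random

words = ["pear", "fig", "banana"]

ascending = sorted(words)                  # new list, input untouched
descending = sorted(words, reverse=True)
by_length = sorted(words, key=len)         # custom sort key
reversed_words = words[::-1]               # reverse the current order

records = [{"name": "a", "score": 2}, {"name": "b", "score": 1}]
by_score = sorted(records, key=lambda r: r["score"])   # sort by nested value

# Shuffle a copy; a seeded generator keeps the run reproducible.
rng = random.Random(0)
shuffled = list(words)
rng.shuffle(shuffled)
```

`sorted()` returns a new list, while `list.sort()` and `random.shuffle()` modify the list in place.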

🔗 11. Joining and Splitting Collections


✅ Concatenate/merge lists, sets, dicts

✅ Join strings from lists

✅ Zip/unzip lists

✅ Chunk/split large collections
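
The following Python sketch illustrates these operations. The `chunk` helper is a hypothetical function written for this example, not a standard-library call.

```python
a, b = [1, 2], [3, 4]
combined = a + b                        # concatenate lists

d1, d2 = {"x": 1, "y": 2}, {"y": 20, "z": 3}
merged = {**d1, **d2}                   # later dict wins on duplicate keys

sentence = ", ".join(["x", "y", "z"])   # join strings from a list

names, ages = ["Ada", "Bob"], [36, 41]
zipped = list(zip(names, ages))         # zip into pairs
names2, ages2 = zip(*zipped)            # unzip back into tuples


def chunk(seq, size):
    """Split seq into consecutive pieces of at most `size` items."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


chunks = chunk([1, 2, 3, 4, 5], 2)
```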

🔄 12. Mapping and Transformation


✅ Apply function to each item (map)
✅ List/set/dict comprehensions (Python)

✅ Transform key/value pairs in dicts

✅ Encode/decode values (e.g., base64, URL, HTML)
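
A minimal Python sketch of mapping, comprehensions, and encoding, with made-up inputs:

```python
import base64
from html import escape, unescape
from urllib.parse import quote, unquote

nums = [1, 2, 3]
squares = list(map(lambda x: x * x, nums))     # apply a function to each item
doubled = [x * 2 for x in nums]                # list comprehension
unique_parity = {x % 2 for x in nums}          # set comprehension

stock = {"apple": 3, "pear": 5}
upper_keys = {k.upper(): v for k, v in stock.items()}   # transform keys
inverted = {v: k for k, v in stock.items()}             # swap keys and values

b64 = base64.b64encode(b"hi").decode("ascii")  # bytes -> base64 text
b64_back = base64.b64decode(b64)

url = quote("a b&c")                           # percent-encode for URLs
url_back = unquote(url)

html_text = escape("<b>")                      # escape HTML special characters
html_back = unescape(html_text)
```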

🧩 13. Set Operations


✅ Union

✅ Intersection

✅ Difference

✅ Symmetric difference

✅ Create subsets of a dataset (with or without conditions) and check subset/superset relationships
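
Python's built-in `set` type covers each of these operations directly. The sketch below uses two small example sets.

```python
a = {1, 2, 3}
b = {3, 4}

union = a | b                   # items in either set
intersection = a & b            # items in both sets
difference = a - b              # items in a but not in b
symmetric = a ^ b               # items in exactly one of the sets

is_subset = {1, 2} <= a         # subset check
is_superset = a >= {1, 2}       # superset check

# Subset built from a condition.
greater_than_one = {x for x in a if x > 1}
```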

🧰 14. Advanced Utilities


✅ Deep copy vs shallow copy

✅ Flatten nested structures

✅ Grouping and aggregation

✅ Recursion/iteration for nested data

✅ Handling missing/null/undefined

✅ Data validation and cleaning
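
Several of these utilities can be sketched in a few lines of Python. The `flatten` helper is a hypothetical function written for this example, and it only descends into lists.

```python
import copy
from collections import defaultdict

# Shallow copies share inner objects; deep copies do not.
nested = [[1, 2], [3]]
shallow = copy.copy(nested)
deep = copy.deepcopy(nested)
nested[0].append(99)


def flatten(items):
    """Recursively yield the leaves of arbitrarily nested lists."""
    for item in items:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item


flat = list(flatten([1, [2, [3, None]], 4]))

# Handle missing values by dropping None entries.
cleaned = [x for x in flat if x is not None]

# Group words by first letter, then aggregate with a count.
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)
counts = {letter: len(words) for letter, words in groups.items()}
```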

📦 15. Object and Class Manipulation (if OOP)


✅ Access properties and methods

✅ Add/remove attributes

✅ Use introspection/reflection

✅ Inheritance/polymorphism considerations

✅ Serialization/deserialization of objects
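
A short Python sketch of these object-level operations follows. The `Animal` and `Dog` classes are hypothetical and exist only for illustration.

```python
import json


class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return f"{self.name} makes a sound"


class Dog(Animal):
    # Overrides the parent method (polymorphism).
    def speak(self):
        return f"{self.name} barks"


pet = Dog("Rex")
sound = pet.speak()                 # access a method

pet.age = 3                         # add an attribute at runtime
had_age = hasattr(pet, "age")
delattr(pet, "age")                 # remove it again

name = getattr(pet, "name")         # reflection: look up by attribute name
is_animal = isinstance(pet, Animal)
state = vars(pet)                   # instance attributes as a dict

payload = json.dumps(state)         # serialize the object's state
restored = Dog(**json.loads(payload))
```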
🧠 Bonus: Meta-Manipulation
✅ Measure memory/space usage

✅ Track performance/timing

✅ Immutable vs mutable manipulation strategies

✅ Thread-safe or async-safe manipulation
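
The points above can be illustrated roughly as follows in Python. The workload and thread counts are arbitrary, and timing and size values vary by machine and interpreter, so none are shown.

```python
import sys
import threading
import time

# Memory: getsizeof reports the shallow size of one object in bytes.
list_size = sys.getsizeof([1, 2, 3])

# Timing: perf_counter is a monotonic clock suited to short measurements.
start = time.perf_counter()
total = sum(range(10_000))
elapsed = time.perf_counter() - start

# Immutable strategy: build a new tuple instead of changing the old one.
frozen = (1, 2, 3)
extended = frozen + (4,)

# Thread-safe mutation: guard shared state with a lock.
counter = {"value": 0}
lock = threading.Lock()


def increment(times):
    for _ in range(times):
        with lock:
            counter["value"] += 1


threads = [threading.Thread(target=increment, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```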

Mapped to General Concepts + Python + R + SQL


| # | 📌 Concept | 🔄 General Description | 🐍 Python | 📊 R | SQL |
|---|---|---|---|---|---|
| 1 | Type Checking | Know the type of a value | type(x) | class(x), typeof(x) | SELECT typeof(column) (SQLite) |
| 2 | Type Conversion | Convert between types | int("3"), str(3) | as.numeric("3") | CAST(column AS INT) |
| 3 | Value Assignment / Mutation | Change or set value | lst[0] = 5 | vec[1] <- 5 | UPDATE table SET col = 5 |
| 4 | Count / Length | Get number of items | len(x) | length(x) / nchar() | COUNT(*) |
| 5 | Text Case Manipulation | Capitalize, lowercase, titlecase | "abc".upper() | toupper("abc") | UPPER(column) |
| 6 | Indexing / Position Access | Access element by position | s[0], lst[2] | vec[1], list[[1]] | N/A (can use LIMIT, OFFSET) |
| 7 | Arithmetic / Logic | Perform calculations, comparisons | a + b, a > b | a + b, a > b | SELECT a + b |
| 8 | Slicing / Subsets | Get sub-parts of data | lst[1:3], s[1:3] | vec[1:3], subset() | SELECT * WHERE, LIMIT, SUBSTRING() |
| 9 | Pattern Matching / Search | Find by pattern or rule | "abc".startswith("a") | grepl("^a", str) | LIKE 'a%', REGEXP |
| 10 | Counting Occurrences | Frequency of values | lst.count(3) | sum(vec == 3) | SELECT COUNT(*) WHERE ... |
| 11 | Appending / Adding Values | Add item(s) to a collection | lst.append(4) | c(vec, 4) | INSERT INTO table VALUES (...) |
| 12 | Sorting | Order data | sorted(lst) | sort(vec) | ORDER BY |
| 13 | Reversing | Reverse item order | lst[::-1], reverse() | rev(vec) | ORDER BY col DESC |
| 14 | Collection Conversion | Convert between list, set, tuple, dict | list(), tuple(), set() | as.list(), unlist() | N/A (done via table structures) |
| 15 | Object/Field Access | Access properties of an object | obj.attr | obj$field / S3/S4 slots | Table.column |
| 16 | Filtering / Conditional Selection | Filter rows/values by condition | [x for x in lst if x > 5] | vec[vec > 5] | SELECT * FROM table WHERE col > 5 |
| 17 | Aggregation / Summarization | Sum, average, min, max | sum(lst), max(lst) | sum(vec), mean(vec) | SUM(col), AVG(col) |
| 18 | Joining / Merging | Combine multiple data sources | pd.merge(df1, df2) | merge(df1, df2) | JOIN |
| 19 | Grouping | Group by a category and summarize | groupby() (Pandas) | group_by() + summarize() | GROUP BY |
| 20 | Missing Data Handling | Handle None, NA, NULL | is None, if x is not None: | is.na(x) | IS NULL, COALESCE() |
| 21 | Removing Items / Values | Delete values | del lst[0], remove(x) | vec[-1], subset() | DELETE FROM table WHERE ... |
| 22 | Deduplication | Remove duplicates | set(lst), dict.fromkeys() | unique(vec) | SELECT DISTINCT |
| 23 | Renaming Items | Rename columns, variables | df.rename() (Pandas) | rename() (dplyr) | AS keyword |
| 24 | Reshaping | Pivot, melt, transpose | pivot(), melt() (Pandas) | pivot_longer(), spread() | PIVOT, UNPIVOT (SQL Server, Oracle) |
| 25 | Combining / Concatenation | Combine multiple elements or datasets | " ".join(lst), + | paste(), bind_rows() | UNION, CONCAT() |
| 26 | Splitting | Break into parts | "a,b".split(",") | strsplit() | SPLIT_PART() (PostgreSQL) |
| 27 | Iteration | Loop through elements | for x in lst: | for (x in vec) / lapply() | Cursors or iterative SQL procedures |
| 28 | Function Application | Apply function to items | map(func, lst) | lapply(), sapply() | Not native; needs subquery or UDF |
| 29 | Nested Structures Handling | Access deep elements | dict['key']['sub'] | list[[1]][[2]] | Not typical in SQL |
| 30 | Data Structure Creation | Define data structures | [], {}, () | c(), list(), data.frame() | CREATE TABLE, INSERT |
