
purarue
Contributor

@purarue purarue commented May 31, 2024

Parses all the MMS file/content parts. I left comments in alongside my exploration of the data object.

want to actually try consuming the output of this somewhere before I think it should be merged

also, there are some items parsed as MMS which are actually just text (they're SMIL (XML-format) containers which contain a single text/plain file), with no pictures attached. So, it's just a text that's stored in an MMS container. In an ideal world we could filter those out and add them to messages? But that may not always work perfectly

So, we should probably at least note that mms can have text items, or... maybe combine them/have some helper .property methods on the MMS that give you nicer filtering/output?

will post some examples when I'm playing with this, gonna go sleep now

$ hpi doctor -S my.smscalls
✅ OK  : my.smscalls                                      
✅     - stats: {..... 'mms': {'count': 2491, 'first': datetime.datetime(2019, 12, 18, 4, 22, 46, tzinfo=datetime.timezone.utc), 'last': datetime.datetime(2024, 5, 29, 1, 31, 19, tzinfo=datetime.timezone.utc)}}

@purarue
Contributor Author

purarue commented May 31, 2024

Probably instead of adding it to messages, we can just add helpers...? If there are multiple content parts, a pretty common case is for there to be one text/plain part and multiple image/jpeg parts attached.

Maybe a property that checks if there's only one part and it's text, with no other parts attached
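A rough sketch of what such a helper could look like. The class names, the `text` field and the `is_text_only` / `plain_text` properties are hypothetical, not what the PR implements; only the `content` / `content_type` names come from the query output in this thread. Skipping `application/smil` parts is also an assumption about how the container might be represented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MMSContentPart:  # hypothetical name, for illustration only
    content_type: str
    text: Optional[str] = None


@dataclass
class MMS:  # hypothetical, trimmed down to the one field the helpers need
    content: List[MMSContentPart] = field(default_factory=list)

    def _payload_parts(self) -> List[MMSContentPart]:
        # Assumption: a SMIL layout part, if present, is not real content
        return [p for p in self.content if p.content_type.lower() != "application/smil"]

    @property
    def is_text_only(self) -> bool:
        # True when the only real part is a single text/plain part
        parts = self._payload_parts()
        return len(parts) == 1 and parts[0].content_type.lower() == "text/plain"

    @property
    def plain_text(self) -> Optional[str]:
        # The message text if this MMS is really just a text message, else None
        return self._payload_parts()[0].text if self.is_text_only else None


group_text = MMS(content=[MMSContentPart("text/plain", text="Loved an image")])
with_photo = MMS(content=[MMSContentPart("text/plain", text="look"), MMSContentPart("image/jpeg")])
```

With this, `group_text.is_text_only` is true and `with_photo.is_text_only` is false, so callers could filter the 'just text' items without touching the messages function.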

it might be nice to be able to handle the common mime types as well, like parsing images into PIL images

here's what I have:

$ hpi query my.smscalls.mms -s | jq '.content.[].content_type' -r | tally
      1 text/x-vcard
      4 video/3gpp
      5 image/gif
      7 text/x-vCard
     43 image/png
    429 image/jpeg
   2405 text/plain
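Note that `text/x-vcard` and `text/x-vCard` are counted separately above. A small sketch of the same tally in Python that lowercases the MIME type first; `tally_content_types` is a hypothetical helper, and the sample list is made up rather than taken from the real export.

```python
from collections import Counter
from typing import Iterable


def tally_content_types(content_types: Iterable[str]) -> Counter:
    # MIME types are case-insensitive, so normalise before counting
    return Counter(ct.strip().lower() for ct in content_types)


# Made-up sample, not the real data from the export
sample = ["text/x-vcard", "text/x-vCard", "text/x-vCard", "image/jpeg", "text/plain"]
counts = tally_content_types(sample)
```

Here `counts["text/x-vcard"]` is 3, since both spellings fall into the same bucket.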

Other todos:

  • Want to see if I can decode the target for the translated 'Liked an image'/'Liked a message' Apple stuff
  • need to check on the emitted key, I just copied the message one, not sure how accurate it is

@purarue
Contributor Author

purarue commented May 31, 2024

On decoding the target of the 'Liked an image' stuff: looks like probably not

seems that stuff often breaks when things are re-imported or when you move SIM cards, so it's likely some translation happening in-app that isn't saved perfectly in an export

Was looking to see if I could match any ids like in here:

<mms date="1648436193000" rr="129" sub="null" ct_t="application/vnd.wap.multipart.mixed" read_status="null" seen="1" msg_box="1" address="<REDACTED>" sub_cs="null" resp_st="null" retr_st="128" d_tm="null" text_only="1" exp="null" locked="0" m_id="mavodi-6-89-1e8-8-ba-628c370c-7d583aca2b" st="null" retr_txt_cs="null" retr_txt="null" creator="com.google.android.apps.messaging" date_sent="1648436191" read="1" m_size="243" rpt_a="null" ct_cls="null" pri="null" sub_id="2" tr_id="null" resp_txt="null" ct_l="<REDACTED>" m_cls="null" d_rpt="129" v="18" _id="740" m_type="132" readable_date="Mar 27, 2022 7:56:33 PM" contact_name="<REDACTED>"> <parts> <part seq="0" ct="text/plain" name="null" chset="3" cd="null" fn="null" cid="&lt;0&gt;" cl="null" ctt_s="null" ctt_t="null" text="Loved an image"/> </parts> </mms>
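For reference, a minimal sketch of reading an element like this with the standard library. The sample is a trimmed copy of the element above with the redacted attributes replaced by a plain placeholder, and `parse_mms` is a hypothetical helper, not the parser from this PR. It assumes `date` is milliseconds since the epoch, which matches the `readable_date` shown above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Trimmed copy of the element above; address is a placeholder
SAMPLE = """<mms date="1648436193000" msg_box="1" text_only="1" address="REDACTED"
     m_id="mavodi-6-89-1e8-8-ba-628c370c-7d583aca2b">
  <parts>
    <part seq="0" ct="text/plain" cid="&lt;0&gt;" text="Loved an image"/>
  </parts>
</mms>"""


def parse_mms(xml_text: str):
    el = ET.fromstring(xml_text)
    # Assumption: 'date' is epoch milliseconds
    when = datetime.fromtimestamp(int(el.attrib["date"]) / 1000, tz=timezone.utc)
    # Collect (content type, text) for every part in the message
    parts = [(p.attrib.get("ct"), p.attrib.get("text")) for p in el.iter("part")]
    return when, parts


when, parts = parse_mms(SAMPLE)
```

The only part is the text/plain one holding "Loved an image"; nothing in the attributes obviously points at the message being reacted to.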

but my basic exploration of grepping IDs across the file doesn't seem to have worked. There are lots of random keys/values/IDs though, maybe they could be indexes in the conversation/some auto-indexed ID...? Can't figure it out, just guessing

@purarue
Contributor Author

purarue commented May 31, 2024

Ahhh, any group convo is converted into an MMS since otherwise it can't accurately encode who sent which message.

So that's why there are messages that are 'just text'

message_type is just 1 = Received, 2 = Sent, 3 = Draft, 4 = Outbox. For group messages contact_name is just a list, so you only know that you received it but not from whom. The addresses are described here: https://www.synctech.com.au/sms-backup-restore/fields-in-xml-backup-files/

<addr address="<--->" type="130" charset="106"/>
<addr address="<--->" type="130" charset="106"/>
<addr address="<--->" type="130" charset="106"/>
<addr address="<--->" type="130" charset="106"/>
<addr address="<--->" type="137" charset="3"/>
<addr address="<--->" type="151" charset="106"/>

where it specifies which number and then a type tells you who actually sent it...

129 = BCC, 130 = CC, 151 = To, 137 = From ... weird schema.
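A sketch of how those type codes could be used to pick out the sender of a group message. The `ADDR_TYPES` mapping follows the codes listed above; the `<addrs>` wrapper element, the phone numbers and the `group_addresses` helper are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Type codes as listed above
ADDR_TYPES = {129: "bcc", 130: "cc", 137: "from", 151: "to"}

# Made-up numbers; the <addrs> wrapper is an assumption about the export layout
SAMPLE = """<addrs>
  <addr address="+15550000001" type="130" charset="106"/>
  <addr address="+15550000002" type="137" charset="3"/>
  <addr address="+15550000003" type="151" charset="106"/>
</addrs>"""


def group_addresses(addrs_xml: str) -> Dict[str, List[str]]:
    # Group the addresses by role, so the sender is whatever sits under 'from'
    grouped: Dict[str, List[str]] = {}
    for addr in ET.fromstring(addrs_xml).iter("addr"):
        role = ADDR_TYPES.get(int(addr.attrib["type"]), "unknown")
        grouped.setdefault(role, []).append(addr.attrib["address"])
    return grouped


grouped = group_addresses(SAMPLE)
```

In this sample `grouped["from"]` holds the single sender, while the other participants end up under `cc` and `to`.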

@purarue changed the title from "initial mms exploration" to "smscalls: parse mms from smscalls export" on Jun 3, 2024
@purarue
Contributor Author

purarue commented Jun 3, 2024

Was able to use it nicely in some scripts I have to preview convos, and save any images found to ~/.cache/sms-images, synced those scripts up here:

purarue/HPI-personal@d8b0539

$ sms-images
....
Saving /home/sean/.cache/sms-images/.../1716941561.0-IMG_7345.jpg.jpg
Done, saved 465 images, using 82.697 MB

am sure there are some small issues I may have missed, but those will get found with more usage - I think this is good enough to merge and start using.
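For anyone wanting to do something similar, a minimal sketch of the image-saving step. This is not the script linked above: `save_image_part` is a hypothetical helper, and it assumes the part payload is available as a base64 string. The `<timestamp>-<filename>` naming loosely mirrors the output shown above.

```python
import base64
import tempfile
from pathlib import Path


def save_image_part(data_b64: str, filename: str, epoch: float, target_dir: Path) -> Path:
    # Assumption: the part payload is a base64-encoded string
    target_dir.mkdir(parents=True, exist_ok=True)
    # Keep only the final path component so a part name can't escape target_dir
    safe_name = Path(filename).name
    out = target_dir / f"{epoch}-{safe_name}"
    out.write_bytes(base64.b64decode(data_b64))
    return out


# Demo against a temporary directory instead of ~/.cache/sms-images
with tempfile.TemporaryDirectory() as tmp:
    payload = base64.b64encode(b"not really a jpeg").decode("ascii")
    saved = save_image_part(payload, "IMG_0001.jpg", 1716941561.0, Path(tmp) / "sms-images")
    saved_name = saved.name
    saved_bytes = saved.read_bytes()
```

Prefixing the timestamp keeps files with the same camera filename from overwriting each other.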

purarue added 5 commits June 2, 2024 17:31
lowers the chance that if the value is actually
"NULL" or 'Null' or something, it isn't misidentified
@purarue
Contributor Author

purarue commented Jun 5, 2024

@karlicoss this should be good to review/merge

@karlicoss
Owner

Thanks! I literally never received or sent an MMS, so don't really have any opinion, happy to merge :)

@karlicoss karlicoss merged commit 35dd5d8 into karlicoss:master Jun 5, 2024
@purarue
Contributor Author

purarue commented Jun 5, 2024

never received or sent an MMS

I didn't think I had that many either, but any group chat turns itself into an MMS because there's no other way to encode who the message is coming from when there are multiple people (see comment)...

So, apparently I had a couple hundred because am in group chats with family etc.

I mostly care about this for random images that are stored in it though, am going to embed the correct date into that and add it to my eventual photos module.

thanks 👍

@purarue
Contributor Author

purarue commented Jun 5, 2024

Oh, you may have to edit this line then, I assumed you had a few so I just put the standard, eh, "check if there's 10 things in this" test

@karlicoss
Owner

Yeah, noticed this as well, but sadly this isn't running regularly anyway. Ideally we'd add some test file to https://github.com/karlicoss/hpi-testdata or something like that, so it can run on CI

@purarue
Contributor Author

purarue commented Jun 5, 2024

Yeah, I always get paranoid about pushing location data test files or a modified XML export, just because I always think I've missed something. I've been wanting to write some tool that takes JSON/XML as input and creates valid-looking dummy data.
