Geekzone: technology news, blogs, forums
Guest
Welcome Guest.
You haven't logged in yet. If you don't have an account you can register now.


Delphinus

611 posts

Ultimate Geek
+1 received by user: 274


#295487 1-Apr-2022 12:17
Send private message

We're trying to import (reverse engineer) some data from someone elses MS SQL database, without any vendor support.

 

In the past the data has been stored in either plain text or RTF so easy to extract. But this database has some 'encrypted' looking content like this:

 

 

Any thoughts on how to decode it? There are other rows in the same table that are plain text but most are this format.


View this topic in a long page with up to 500 replies per page Create new topic
 1 | 2

xpd

xpd
Geek of Coastguard
14193 posts

Uber Geek
+1 received by user: 4659

Retired Mod
ID Verified
Trusted
Lifetime subscriber

  #2895291 1-Apr-2022 12:30
Send private message

Without knowing the encryption method, think you're stuck unfortunately.

 

 





XPD / Gavin

 

LinkTree

 

 

 




Delphinus

611 posts

Ultimate Geek
+1 received by user: 274


  #2895292 1-Apr-2022 12:32
Send private message

Is it actually encrypted in your opinion based on that screenshot? I was wondering if it was some other file format like doc/docx/image file etc.

 

Passwords are stored in plain text! This is the only table I can find with anything encrypted.


Behodar
11207 posts

Uber Geek
+1 received by user: 6232

Trusted
Lifetime subscriber

  #2895296 1-Apr-2022 12:47
Send private message

Export one of the fields to a binary file then run the Unix file command on it and see whether it returns anything.




pih

pih
670 posts

Ultimate Geek
+1 received by user: 359

Lifetime subscriber

  #2895469 1-Apr-2022 16:00
Send private message

Behodar:

Export one of the fields to a binary file then run the Unix file command on it and see whether it returns anything.



Along these lines, use the sys.fn_varbintohexstr function to extract the first several bytes in hex and compare against file signatures: https://en.wikipedia.org/wiki/List_of_file_signatures

Does nothing about the metadata indicate what it might be?

Delphinus

611 posts

Ultimate Geek
+1 received by user: 274


  #2895476 1-Apr-2022 16:14
Send private message

pih:

 

Does nothing about the metadata indicate what it might be?

 

 

Oh it's definitely a document of some form, RTF etc. Is what what you mean?

 

We've just never seen it in this format before when dealing with databases from this vendor.


nzkc
1638 posts

Uber Geek
+1 received by user: 1043


  #2895498 1-Apr-2022 17:55
Send private message

A bit of left field approach might be to set up full text support on the table and column(s). With that you can state its format, so you'll have to guess a few and it'll interpret it. If you get it right, you can query for words in the content then which may confirm its type for you.

 

It'll be something like PDF format as a total guess.

 

What you migtht have a problem with is that it looks, based on the column name, is that its stored as a character type (VARCHAR, TEXT etc) rather than a binary type. You might have lost some true information from it based on however the insert was performed.


 
 
 
 

Shop now for Dell laptops and other devices (affiliate link).
Linux
12360 posts

Uber Geek
+1 received by user: 8650

Trusted
Lifetime subscriber

  #2895501 1-Apr-2022 18:24
Send private message

I was thinking .pdf

davidcole
6112 posts

Uber Geek
+1 received by user: 1476

Trusted

  #2895503 1-Apr-2022 18:37
Send private message

nzkc:

 

A bit of left field approach might be to set up full text support on the table and column(s). With that you can state its format, so you'll have to guess a few and it'll interpret it. If you get it right, you can query for words in the content then which may confirm its type for you.

 

It'll be something like PDF format as a total guess.

 

What you migtht have a problem with is that it looks, based on the column name, is that its stored as a character type (VARCHAR, TEXT etc) rather than a binary type. You might have lost some true information from it based on however the insert was performed.

 

 

nah pdfs tend to be fairly readable in text editors.  And normally start something like %pdf

 

Eg, my power bill

 

%PDF-1.4
%âãÏÓ

 

(I work in pdfs all day storing them)

 

Unless with the inserting to a varchar, it's being stripped of all the high order characters.

 

 

 

 





Previously known as psycik

Home Assistant: Gigabyte AMD A8 Brix, Home Assistant with Aeotech ZWave Controller, Raspberry PI, Wemos D1 Mini, Zwave, Shelly Humidity and Temperature sensors
Media:Chromecast v2, ATV4 4k, ATV4, HDHomeRun Dual
Server
Host Plex Server 3x3TB, 4x4TB using MergerFS, Samsung 850 evo 512 GB SSD, Proxmox Server with 1xW10, 2xUbuntu 22.04 LTS, Backblaze Backups, usenetprime.com fastmail.com Sharesies Trakt.TV Sharesight 


gzt

gzt
19007 posts

Uber Geek
+1 received by user: 8101

Lifetime subscriber

  #2895512 1-Apr-2022 18:59
Send private message

Provide the other columns? There might be a clue.

Delphinus

611 posts

Ultimate Geek
+1 received by user: 274


  #2895539 1-Apr-2022 22:37
Send private message

gzt: Provide the other columns? There might be a clue.

 

You might be onto something, I note the column zipped is 1 for these 'encrypted' rows.

 

[counter]
      ,[ctrPatient]
      ,[ctrUser]
      ,[date]
      ,[entered]
      ,[type]
      ,[number]
      ,[keyfield]
      ,[letterto]
      ,[copies]
      ,[text]
      ,[printed]
      ,[deleted]
      ,[bywhom]
      ,[dWhen]
      ,[ctrOld]
      ,[ctrReferral]
      ,[RsdType]
      ,[ctrLoginCreate]
      ,[ctrFormLetter]
      ,[ctrLoginCheck]
      ,[DateChecked]
      ,[Zipped]
      ,[not_PrescrAllowSubst]
      ,[not_MedicoLegal]
      ,[not_ctrNoteSubType]
      ,[not_ctrLabMessage]
      ,[not_CallingMode]
      ,[not_SourceTable]
      ,[not_SourceRecord]
      ,[not_ctrHealthDocSubType]
      ,[not_ctrEmailFile]
      ,[not_Comment]
      ,[not_eAcc18]
      ,[not_WasEmailed]
      ,[not_WordCount]
      ,[not_PrescrDrugsOfDependence]

 

 

 

However I'm not sure how to export it to a file we can work with.

 

Got this from my colleague I'm working with:

 

 

 

There are a string of characters in the DB with latin1 encoding, which is apparently the predecessor of windows cp1252. Although not even latin1 encoding, but latin1_general_ci collation, which (I'm not completely sure, but) is encoding+charset, so the charset could be different, but let's assume we just have some text with latin1 encoding and omit the charset. I read the text as string from the DB, need to convert it to bytes in order to write to file. In order to convert to binary bytes, I need the encoding, but when I set latin1 as the encoding, it threw 'unsupported characters' error. So I could only binarize it with CP1252.
Wrote that into a binary file.
Did file binaryfile.bin  which results in 'data' - unknown file format

 

 


Delphinus

611 posts

Ultimate Geek
+1 received by user: 274


  #2895541 1-Apr-2022 22:47
Send private message

Because it's saved as text (not blob or binary) how do you export it?


 
 
 

Shop now on AliExpress (affiliate link).

gzt

gzt
19007 posts

Uber Geek
+1 received by user: 8101

Lifetime subscriber

  #2895565 2-Apr-2022 08:55
Send private message

That character encoding test might be giving odd results. I don't suppose it's as easy as coping that text to a file and giving that file a zip extension, or even giving that binary a zip extension?

My picks would be WinZip and WinRAR for variety of data handled and recovery. 7zip might be worth a look too.

Delphinus

611 posts

Ultimate Geek
+1 received by user: 274


  #2895576 2-Apr-2022 09:40
Send private message

Just tried those extensions, no luck :(


davidcole
6112 posts

Uber Geek
+1 received by user: 1476

Trusted

  #2895611 2-Apr-2022 11:31
Send private message

If it’s potentially a zip file converted to text it the database, you probably need to work out what the binary to character conversion is to reverse that.   Then you’d have a binary which should be some sort of archive.

 

Maybe base64 encoding or something like that, and just casting binary to string would result in loss.   And I assume this stuff has been extracted in the past. 





Previously known as psycik

Home Assistant: Gigabyte AMD A8 Brix, Home Assistant with Aeotech ZWave Controller, Raspberry PI, Wemos D1 Mini, Zwave, Shelly Humidity and Temperature sensors
Media:Chromecast v2, ATV4 4k, ATV4, HDHomeRun Dual
Server
Host Plex Server 3x3TB, 4x4TB using MergerFS, Samsung 850 evo 512 GB SSD, Proxmox Server with 1xW10, 2xUbuntu 22.04 LTS, Backblaze Backups, usenetprime.com fastmail.com Sharesies Trakt.TV Sharesight 


Delphinus

611 posts

Ultimate Geek
+1 received by user: 274


  #2895659 2-Apr-2022 11:40
Send private message

What do you mean this stuff has been extracted in the past? We've done imports of this data in the past (ie Dec 2021/Jan 2022) when it was just RTF stored as text in the database. Vendor (our competition) seems to have changed their storage method in the past few months, just for this table.


 1 | 2
View this topic in a long page with up to 500 replies per page Create new topic








Geekzone Live »

Try automatic live updates from Geekzone directly in your browser, without refreshing the page, with Geekzone Live now.



Are you subscribed to our RSS feed? You can download the latest headlines and summaries from our stories directly to your computer or smartphone by using a feed reader.