www.7-zip.org/recover.html

   1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">\r
   2 <HTML>\r
   3 <HEAD>\r
   4 <META http-equiv="CONTENT-TYPE" content="TEXT/HTML; CHARSET=UTF-8">\r
   5 <LINK href="style.css" rel="stylesheet" type="text/css">\r
   6 <!-- HeadSubstituteBegin -->\r
   7 <TITLE>How to recover corrupted 7z archive</TITLE>\r
   8 <META name="keywords" content="7z, lzma, 7-zip, archiver, free, compression, zip, \r
   9 best, compress, solid, high, ratio, unzip, far, win32, 7zip, long, file, names">\r
  10 <!-- HeadSubstituteEnd -->\r
  11 </HEAD>\r
  12 <BODY>\r
  13 \r
  14 <!-- to page translators: this page describes some technical things and\r
  15 can be changed several times in future. It can be difficult for you\r
  16 to support updated version of translation. So don't translate it and \r
  17 just link to this original English version of page --> \r
  18 \r
  19 <H1>How to recover corrupted 7z archive</H1>\r
  20 \r
  21 <H2>Try the latest version of 7-Zip</H2>\r
  22 \r
  23 <P>A newer version of 7-Zip may be able to open or extract your 7z archive.\r
  24 So download the latest version of 7-Zip and try it first.\r
  25 You can also try the latest alpha or beta version.\r
  26 If the new version doesn't help either, read this manual.</P>\r
  27 \r
  28 <H2>Required software:</H2>\r
  29 <UL>\r
  30   <LI>7-Zip (the latest version; it can be a stable, alpha or beta version). \r
  31   <LI>A program with a hex viewer or editor,\r
  32       for example, <A href="http://www.farmanager.com">FAR Manager</A>.\r
  33 </UL>\r
  34 \r
  35 <H2>7z archive structure</H2>\r
  36 \r
  37 <P>A 7z archive consists of 4 main blocks of data:\r
  38 <OL>\r
  39   <LI>Start Header (32 bytes): it contains the signature and a link to the End Header.\r
  40   <LI>Compressed Data of files.\r
  41   <LI>Compressed Metadata Block for files: it contains links to the Compressed Data, information about compression methods, CRCs, file names, sizes, timestamps and so on.\r
  42   <LI>End Header: it contains a link to the Compressed Metadata Block.\r
  43 </OL>\r
  44 \r
  45 Note: If a 7z archive contains only one file without encryption, \r
  46       7-Zip stores the Metadata for that file in the End Header in uncompressed form, \r
  47       and there are only 3 main blocks in that case.\r
  48 \r
  49 <H3>Archive example</H3>\r
  50 \r
  51 <P>Archive example: a.7z (3740 bytes) that contains 5 files compressed with the LZMA method.\r
  52 <P>Start of archive:</P>\r
  53 <PRE>\r
  54 0000000000: 37 7A BC AF 27 1C 00 04 5B 38 BE F9 59 0E 00 00 \r
  55 0000000010: 00 00 00 00 23 00 00 00 00 00 00 00 7A 63 68 FD \r
  56 0000000020: 00 21 16 89 6C 71 3D AB 7D 89 E6 3C 2E BE 60 24 \r
  57 \r
  58 00: 6 bytes: 37 7A BC AF 27 1C        - Signature \r
  59 06: 2 bytes: 00 04                    - Format version\r
  60 08: 4 bytes: 5B 38 BE F9              - CRC of the following 20 bytes\r
  61 0C: 8 bytes: 59 0E 00 00 00 00 00 00  - relative offset of End Header\r
  62 14: 8 bytes: 23 00 00 00 00 00 00 00  - the length of End Header\r
  63 1C: 4 bytes: 7A 63 68 FD              - CRC of the End Header\r
  64 \r
  65 The relative offset of the End Header is counted from the end of the Start Header,\r
  66 that is, from offset 0x20 (32 in decimal).\r
  67 Real offset of End Header in example archive = 0x20 + 0x0E59 = 0x0E79\r
  68 \r
  69 20: 00 21 16 89 ... - start of compressed data. \r
  70     Note: if the file was compressed with the LZMA method, the first byte \r
  71           is always 00. If the first byte is not 00, then the archive uses\r
  72           another method (it can be LZMA2 or encrypted data with AES).\r
  73 </PRE>\r
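<P>The field layout above can be decoded with a short script. The following is a minimal Python sketch, not part of 7-Zip. The function name parse_start_header and the dictionary keys are made up for illustration, and the CRC check assumes that 7z uses the standard CRC-32 that Python's zlib.crc32 computes.</P>

```python
import struct
import zlib

SIGNATURE = bytes.fromhex("377ABCAF271C")
START_HEADER_SIZE = 32


def parse_start_header(header: bytes) -> dict:
    """Decode the 32-byte Start Header (hypothetical helper)."""
    if len(header) < START_HEADER_SIZE:
        raise ValueError("need at least 32 bytes")
    if header[:6] != SIGNATURE:
        raise ValueError("7z signature not found")
    # Fields from offset 0x08: CRC of the next 20 bytes, relative offset
    # of End Header, length of End Header, CRC of End Header.
    # All values are little-endian.
    stored_crc, rel_offset, length, end_crc = struct.unpack_from("<IQQI", header, 8)
    return {
        "version": (header[6], header[7]),
        # Assumption: the stored value is a standard CRC-32 of bytes 0x0C..0x1F.
        "start_header_crc_ok": zlib.crc32(header[12:32]) == stored_crc,
        "end_header_rel_offset": rel_offset,
        "end_header_offset": START_HEADER_SIZE + rel_offset,
        "end_header_length": length,
        "end_header_crc": end_crc,
    }


if __name__ == "__main__":
    # First 32 bytes of the example archive a.7z shown above.
    example = bytes.fromhex(
        "377ABCAF271C0004 5B38BEF9 590E000000000000 2300000000000000 7A6368FD"
    )
    for key, value in parse_start_header(example).items():
        print(key, value)
```

<P>For the example archive this reports the End Header at offset 0x20 + 0x0E59 = 0x0E79 with length 0x23, the same values as in the listing above.</P>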
  74 \r
  75 <P>End of archive:</P>\r
  76 <PRE>\r
  77 End Header (relative offset = 0x0E59, real offset = 0x0E79, length = 0x23):\r
  78 \r
  79 0000000E70:                            17 06 8D AD 01 09 80 \r
  80 0000000E80: AC 00 07 0B 01 00 01 23 03 01 01 05 5D 00 10 00 \r
  81 0000000E90: 00 0C 81 1A 0A 01 3C 70 52 F7 00 00             \r
  82 \r
  83 Possible values for first byte in End Header:\r
  84    17 - End Header contains the link to Metadata Block.\r
  85    01 - Metadata block is stored in End Header.\r
  86 </PRE>\r
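<P>The two first-byte values listed above can be checked in code. This is only an illustrative Python sketch: the names END_HEADER_KINDS and describe_end_header are hypothetical, and the sketch knows nothing beyond the two values given in this manual.</P>

```python
# Meaning of the first byte of the End Header, as listed above.
END_HEADER_KINDS = {
    0x17: "End Header contains the link to Metadata Block",
    0x01: "Metadata block is stored in End Header",
}


def describe_end_header(end_header: bytes) -> str:
    """Return a description based only on the first byte (hypothetical helper)."""
    if not end_header:
        return "no data"
    first = end_header[0]
    return END_HEADER_KINDS.get(first, "unexpected first byte 0x%02X" % first)


if __name__ == "__main__":
    # First bytes of the End Header of the example archive a.7z.
    print(describe_end_header(bytes.fromhex("17068DAD010980")))
```

<P>Any other first byte at the expected End Header offset is a hint that the End Header is missing or damaged.</P>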
  87 \r
  88 <H2>Corruption types</H2>\r
  89 \r
  90 <P>There are two possible cases of archive corruption:\r
  91 <UL>\r
  92   <LI>You can open the archive and see the list of files, but when you run the \r
  93       Extract or Test command, there are errors: Data Error or CRC Error.\r
  94   <LI>When you open the archive, you get the message "Can not open file 'a.7z' as archive"\r
  95 </UL>\r
  96 \r
  97 <H2>Corruption case: Data errors or CRC errors for files inside archive</H2>\r
  98 \r
  99 <P>This section describes the case when you can open the archive and see the list of files, \r
 100 but when you run the Extract or Test command, there are errors: Data Error or CRC Error.</P>\r
 101 \r
 102 <P>It's pretty difficult to recover data in that case.</P>\r
 103 \r
 104 <P>If the archive was compressed in "Solid" mode, and you have exact copies of\r
 105 some files from the archive, you can create a similar archive from the good copies of \r
 106 the files with the same settings and in the same order, and replace the "bad" parts of bad.7z \r
 107 with the "good" parts from that good.7z. You must look at the file listings of the bad and \r
 108 good archives and at the logs of the "test" command, and think about ways to replace the bad parts.</P>\r
 109 \r
 110 <P>There are no more instructions here for that corruption case.</P>\r
 111 \r
 112 <H2>Corruption case: Can not open file 'a.7z' as archive</H2>\r
 113 \r
 114 <P>If you try to open or extract the archive and you see the message \r
 115 "Can not open file 'a.7z' as archive", it means that 7-Zip can't read a\r
 116 header at the start or at the end of the archive.</P>\r
 117 \r
 118 <P>In that case you must open the archive in a hex editor and look at the Start Header and End Header.</P>\r
 119 \r
 120 <P>Possible cases:</P>\r
 121 \r
 122 <UL>\r
 123 \r
 124 <LI>Case: If the start of the archive is corrupted, then there is no link to the End Header.\r
 125    But if the End Header is OK, and the size of the archive is also correct, \r
 126    you can replace the data in the Start Header in a hex editor with the following values: \r
 127 <PRE>\r
 128 0000000000: 37 7A BC AF 27 1C 00 04 00 00 00 00 00 00 00 00\r
 129 0000000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00\r
 130 </PRE>\r
 131 \r
 132 <P>Then try to open the archive. If you can open it and you see the list of files, \r
 133 try the Test or Extract command. See also the "Data errors or CRC errors" section on this page.</P>\r
 134 \r
 135 <LI>Case: Start Header and End Header are OK, but the total size of the archive is not OK.\r
 136 You can calculate the correct size of the archive from the values in the Start Header.\r
 137 Then you must restore the correct size. You can insert some data or remove some data \r
 138 somewhere in the archive (for example, at an offset several MBs before the end of the archive).\r
 139 \r
 140 <P>\r
 141 For example, if you have a multi-volume archive: a.7z.001, ... , a.7z.009, \r
 142 but one part a.7z.008 is missing,\r
 143 just copy a.7z.007 to file a.7z.008, and 7-Zip will see the correct size of the archive.\r
 144 Or if some part was truncated, look at the size of the other parts and restore the original (correct) \r
 145 size of the "bad" part, so the total size will be correct again, and 7-Zip will be able \r
 146 to open the headers.</P>\r
 147 \r
 148 <LI>Case: The end of the archive is corrupted or missing. The following text describes that case.\r
 149 \r
 150 </UL>\r
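<P>For the first case in the list above (corrupted Start Header, good End Header, correct size), the hex editor step can also be scripted. The Python sketch below is an illustration under that assumption only. The function name write_patched_copy is hypothetical, and the sketch works on a copy so that the damaged original stays untouched.</P>

```python
import os
import shutil

# Signature and format version followed by 24 zero bytes,
# the same 32 bytes as in the listing for the first case above.
MINIMAL_START_HEADER = bytes.fromhex("377ABCAF271C0004") + bytes(24)


def write_patched_copy(src: str, dst: str) -> None:
    """Copy src to dst and overwrite the first 32 bytes of the copy."""
    if os.path.getsize(src) < len(MINIMAL_START_HEADER):
        raise ValueError("file is shorter than a Start Header")
    shutil.copyfile(src, dst)
    # "r+b" opens the copy for in-place update without truncating it.
    with open(dst, "r+b") as f:
        f.write(MINIMAL_START_HEADER)


if __name__ == "__main__":
    # File names are placeholders for this example.
    write_patched_copy("bad.7z", "bad_patched.7z")
```

<P>The size of the copy is the same as the size of the original, which matters because 7-Zip must still find the End Header at the end of the file.</P>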
 151 \r
 152 <H2>There is no correct End Header at the end of archive</H2>\r
 153 \r
 154 <P>7-Zip writes the full Start Header only at the end of the archive creation operation.</P>\r
 155 \r
 156 <P>Look at the Start Header. If you see the signature and version followed by zeros in the other fields:</P>\r
 157 <PRE>\r
 158 0000000000: 37 7A BC AF 27 1C 00 04 00 00 00 00 00 00 00 00\r
 159 0000000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00\r
 160 </PRE>\r
 161 <P>then the archive creation operation was probably interrupted for some reason.\r
 162 In that case there are probably no Metadata Block and End Header at the end of the archive.</P>\r
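<P>The check for such an unfinished Start Header is simple enough to automate. This is a minimal Python sketch. The name is_unfinished_start_header is hypothetical, and the test only looks for the signature followed by zeros in bytes 0x08 to 0x1F, as in the listing above.</P>

```python
SIGNATURE = bytes.fromhex("377ABCAF271C")


def is_unfinished_start_header(header: bytes) -> bool:
    """True if the header has the 7z signature but all fields after
    the version (bytes 0x08..0x1F) are still zero."""
    return (
        len(header) >= 32
        and header[:6] == SIGNATURE
        and not any(header[8:32])
    )


if __name__ == "__main__":
    unfinished = bytes.fromhex("377ABCAF271C0004") + bytes(24)
    print(is_unfinished_start_header(unfinished))
```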
 163 \r
 164 <P>Note: If the archive is multi-volume, an incomplete Start Header is also possible \r
 165 if the first volume was copied before the end of the archive (the last volume) was written.\r
 166 In that case the archive is not corrupted, and 7-Zip can unpack such an archive if the total \r
 167 size is correct and there is a correct End Header.</P>\r
 168 \r
 169 <P>If the Start Header is OK, you can calculate the correct archive size and compare it with the size \r
 170 of the archive that you have.</P>\r
 171 \r
 172 <P>If there is no End Header, you can not recover file names, timestamps, and other metadata, \r
 173 but it's probably possible to recover some data as a raw file, and then to \r
 174 recover data from the raw file with some parser.</P>\r
 175 \r
 176 <P>We describe all steps with an example:</P>\r
 177 <OL>\r
 178   <LI> Create good 7z archive\r
 179   <LI> Corrupt archive\r
 180   <LI> Recover files from corrupted archive\r
 181 </OL>\r
 182 \r
 183 <H2>Create good archive</H2>\r
 184 \r
 185 <P>We create a good archive. We use readme.txt (1565 bytes) from 7-Zip 9.20 as an example file.</P>\r
 186 <P> Create readme.txt.bz2, readme.zip, readme.txt.gz and readme.txt.xz archives from readme.txt.</P>\r
 187 <P> Create a.7z with the LZMA method that contains all files:\r
 188 <PRE>\r
 189   readme.txt.bz2\r
 190   readme.txt.gz\r
 191   readme.zip\r
 192   readme.txt\r
 193   readme.txt.xz\r
 194 </PRE>\r
 195 <P>We have a.7z (3740 bytes). You can look at that file in a hex editor.\r
 196 It must have a structure similar to the structure of the 7z file described above.</P>\r
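<P>If you want to repeat the experiment, the four small input archives can also be produced with the Python standard library instead of 7-Zip. This is a sketch under that assumption. The function name make_test_inputs is hypothetical, and the resulting files will not be byte-identical to files made by 7-Zip, so the size of your a.7z will differ from 3740 bytes. The a.7z archive itself still has to be created with 7-Zip.</P>

```python
import bz2
import gzip
import lzma
import zipfile
from pathlib import Path


def make_test_inputs(readme: Path, out_dir: Path) -> list:
    """Write bz2, gz, xz and zip versions of readme into out_dir."""
    data = readme.read_bytes()
    out_dir.mkdir(parents=True, exist_ok=True)
    blobs = {
        "readme.txt.bz2": bz2.compress(data),
        "readme.txt.gz": gzip.compress(data),
        "readme.txt.xz": lzma.compress(data, format=lzma.FORMAT_XZ),
    }
    written = []
    for name, blob in blobs.items():
        path = out_dir / name
        path.write_bytes(blob)
        written.append(path)
    # The zip file stores the readme under its original short name.
    zip_path = out_dir / "readme.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(readme, arcname="readme.txt")
    written.append(zip_path)
    return written


if __name__ == "__main__":
    for path in make_test_inputs(Path("readme.txt"), Path("inputs")):
        print(path, path.stat().st_size)
```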
 197 \r
 198 <H2>Corrupt archive</H2>\r
 199 \r
 200 <P>Now we corrupt the a.7z archive. We want to split the archive into two parts:\r
 201 <UL>\r
 202   <LI> a.7z.001: Start Header, some part of Compressed Data\r
 203   <LI> a.7z.002: Some part of Compressed Data, Metadata, End Header\r
 204 </UL>\r
 205 \r
 206 <P>The Metadata Block and End Header are not big for our test archive (smaller than 300 bytes).</P>\r
 207 \r
 208 <P>We call the "Split file..." command in 7-Zip File Manager and type "3000 100G" in the "Split to volumes, bytes:" field (100G means that the second part can not be larger than 100 GB).</P>\r
 209 \r
 210 <P>We have two parts: a.7z.001 (3000 bytes) and a.7z.002 (740 bytes).\r
 211 Then we copy a.7z.001 to bad.7z and try to open bad.7z. We get\r
 212 the message "Can not open file 'bad.7z' as archive", so we have a corrupted archive.</P>\r
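<P>The effect of the split step can be imitated in a few lines of Python. This sketch is not the 7-Zip implementation. The function name split_bytes is hypothetical, and reusing the last size for any further volumes is an assumption about how the size list is applied.</P>

```python
def split_bytes(data: bytes, sizes: list) -> list:
    """Cut data into consecutive volumes of the given sizes.
    Assumption: the last size is reused if more volumes are needed."""
    if not sizes or any(size <= 0 for size in sizes):
        raise ValueError("sizes must be a non-empty list of positive numbers")
    parts = []
    pos = 0
    index = 0
    while pos < len(data):
        size = sizes[min(index, len(sizes) - 1)]
        parts.append(data[pos:pos + size])
        pos += size
        index += 1
    return parts


if __name__ == "__main__":
    G = 1024 ** 3
    archive = bytes(3740)  # stand-in for the contents of a.7z
    volumes = split_bytes(archive, [3000, 100 * G])
    print([len(v) for v in volumes])
```

<P>With a 3740-byte input and the sizes "3000 100G", the sketch gives a first part of 3000 bytes and a second part of 740 bytes, the same sizes as a.7z.001 and a.7z.002.</P>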
 213 \r
 214 <H3>Recover archive</H3>\r
 215 \r
 216 <P>We open bad.7z in a hex editor:</P>\r
 217 \r
 218 <PRE>\r
 219 0000000000: 37 7A BC AF 27 1C 00 04 5B 38 BE F9 59 0E 00 00\r
 220 0000000010: 00 00 00 00 23 00 00 00 00 00 00 00 7A 63 68 FD\r
 221 0000000020: 00 21 16 89 6C 71 3D AB 7D 89 E6 3C 2E BE 60 24\r
 222 </PRE>\r
 223 \r
 224 <P>We see that the Start Header is OK.</P>\r
 225 <P>We calculate the correct archive size from the Start Header field values:</P>\r
 226 <P>0x0E59 + 0x20 + 0x23 = 0x0E9C = 3740</P>\r
 227 <P>Correct size is 3740 bytes, but our "bad.7z" is only 3000 bytes.</P>\r
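<P>The same arithmetic can be done by a script that reads the two fields from the Start Header. This is a minimal Python sketch. The names expected_archive_size and missing_bytes are hypothetical, and the sketch assumes that the Start Header itself is intact.</P>

```python
import struct

START_HEADER_SIZE = 0x20


def expected_archive_size(start_header: bytes) -> int:
    """Start Header size + relative offset of End Header + its length."""
    if len(start_header) < START_HEADER_SIZE:
        raise ValueError("need at least 32 bytes")
    # Two little-endian 64-bit fields at offsets 0x0C and 0x14.
    rel_offset, length = struct.unpack_from("<QQ", start_header, 0x0C)
    return START_HEADER_SIZE + rel_offset + length


def missing_bytes(start_header: bytes, actual_size: int) -> int:
    """Positive if the file is too short, negative if it is too long."""
    return expected_archive_size(start_header) - actual_size


if __name__ == "__main__":
    # Start Header of bad.7z as shown in the hex dump above.
    header = bytes.fromhex(
        "377ABCAF271C0004 5B38BEF9 590E000000000000 2300000000000000 7A6368FD"
    )
    print(expected_archive_size(header), missing_bytes(header, 3000))
```

<P>For bad.7z this gives an expected size of 3740 bytes and 740 missing bytes.</P>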
 228 \r
 229 <P>We look at the end of the archive:</P>\r
 230 <PRE>\r
 231 0000000B60: 55 73 EA 87 45 18 FC AD 67 0D 40 EF F4 41 49 63\r
 232 0000000B70: 6A 87 54 70 32 6C B0 8F 76 2A 63 BF 12 5D 88 CD\r
 233 0000000B80: 22 76 9F 97 05 3B 37 BE 49 CD F8 0A CC 67 FB FE\r
 234 0000000B90: 17 2E 16 D5 1F 8C 5A 30 08 7F C6 E9 98 9F 00 F1\r
 235 0000000BA0: A6 99 F9 ED 01 62 84 48 77 69 C7 65 21 21 42 66\r
 236 0000000BB0: 48 F1 FE 79 06 08 25 68\r
 237 </PRE>\r
 238 \r
 239 <P>We don't see an End Header at the end of the archive.</P>\r
 240 \r
 241 <P>Conclusion: the archive was probably truncated.</P>\r
 242 \r
 243 <P>Now we want to create another "good" 7z archive that contains a good Start Header and End Header,\r
 244 and we want to place the Compressed Data block from bad.7z inside that new "good" archive.</P>\r
 245 \r
 246 <P>First we look at the start of the Compressed Data block in bad.7z:</P>\r
 247 <PRE>\r
 248 0000000020: 00 21 16 89 6C 71 3D AB 7D 89 E6 3C 2E BE 60 24\r
 249 </PRE>\r
 250 \r
 251 <P>If the LZMA method was used, then the first byte of the compressed data is always 0 and \r
 252 the high bit of the second byte is also 0. So if we see 00 in the first byte and a value from 00 to 7F in the second byte, the LZMA method was probably used (not LZMA2).</P>\r
 253 \r
 254 <P>If the first byte of the compressed data is not 0 or if the value of the second byte is higher\r
 255  than 7F, then it's not an LZMA stream. It can be LZMA2 (or an AES encrypted stream).</P>\r
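<P>The two-byte rule above can be written as a small test. This is a Python sketch of that heuristic only. The name looks_like_lzma is hypothetical, and a positive result means "probably LZMA", not a proof.</P>

```python
def looks_like_lzma(compressed: bytes) -> bool:
    """Apply the rule above: first byte is 00 and the second byte
    is in the range 00..7F (its high bit is 0)."""
    return (
        len(compressed) >= 2
        and compressed[0] == 0x00
        and compressed[1] <= 0x7F
    )


if __name__ == "__main__":
    # Start of the Compressed Data block of bad.7z (offset 0x20).
    print(looks_like_lzma(bytes.fromhex("002116896C713DAB")))
```

<P>The input is the data that starts at offset 0x20 of the archive, right after the Start Header.</P>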
 256 \r
 257 <P>We must create a new "good" 7z archive with the same method as in bad.7z, and the new archive \r
 258 must be much larger than bad.7z.</P>\r
 259 \r
 260 <P>So we select some big file for that new archive. In some cases you can even use bad.7z as \r
 261 that big file. \r
 262 But we use 7-zip.chm. We rename 7-zip.chm (91020 bytes) to raw.dat and we compress \r
 263 raw.dat to raw.7z with the LZMA method and a big dictionary size. The dictionary size \r
 264 must be equal to or larger than the dictionary size in bad.7z.</P>\r
 265 \r
 266 <P>raw.7z is 84898 bytes, which is much larger than bad.7z, as required. If raw.7z \r
 267 is smaller than "bad.7z", you must create another raw.7z from another raw.dat that is larger.</P>\r
 268 \r
 269 <P>We call the "Split file..." function for bad.7z and type "32 100G" in the "Split to volumes, bytes:" field.</P>\r
 270 <P>It creates 2 parts:</P>\r
 271 <UL>\r
 272   <LI>bad.7z.001:   32 bytes : Start Header\r
 273   <LI>bad.7z.002: 2968 bytes : start of Compressed Data\r
 274 </UL>\r
 275 \r
 276 <P>We call the "Split file..." function for raw.7z and type "32 2968 100G" in the "Split to volumes, bytes:" field. Note that the value 2968 is equal to the size of "bad.7z.002". When you recover \r
 277 a real archive, you must use the exact size of your bad.7z.002.</P>\r
 278 \r
 279 <P>It creates 3 parts:</P>\r
 280 <UL>\r
 281   <LI>raw.7z.001:   32 bytes : Start Header\r
 282   <LI>raw.7z.002: 2968 bytes : start of Compressed Data\r
 283   <LI>raw.7z.003: 81898 bytes : end of Compressed Data, Metadata Block, End Header\r
 284 </UL>\r
 285 \r
 286 <P>Then we rename the bad.7z.002 file to raw.7z.002, replacing the original raw.7z.002.</P>\r
 287 \r
 288 <P>Now multi-volume "raw.7z.*" archive contains good headers from raw.7z and compressed data from "bad.7z"</P>\r
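<P>The split and rename steps amount to overwriting the start of the Compressed Data in raw.7z with the Compressed Data from bad.7z. The Python sketch below builds the equivalent single file, that is, the three raw.7z.* volumes joined together. The function name splice_archives is hypothetical, and the sketch assumes both inputs have a 32-byte Start Header and that raw.7z is larger than bad.7z.</P>

```python
START_HEADER_SIZE = 32


def splice_archives(raw: bytes, bad: bytes) -> bytes:
    """Keep the Start Header and the tail of raw, but replace the bytes
    after the Start Header with the compressed data from bad."""
    payload = bad[START_HEADER_SIZE:]            # role of bad.7z.002
    tail_start = START_HEADER_SIZE + len(payload)
    if len(raw) <= tail_start:
        raise ValueError("raw archive must be larger than the bad archive")
    return raw[:START_HEADER_SIZE] + payload + raw[tail_start:]


if __name__ == "__main__":
    # File names follow the example in the text.
    with open("raw.7z", "rb") as f:
        raw = f.read()
    with open("bad.7z", "rb") as f:
        bad = f.read()
    with open("spliced.7z", "wb") as f:
        f.write(splice_archives(raw, bad))
```

<P>The result has the same size as raw.7z, so the offsets stored in its Start Header still point to its own End Header.</P>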
 289 \r
 290 <P>We press "Extract" for the raw.7z.001 file. It will extract the raw.dat file and it will probably show a "Data Error" message.</P>\r
 291 \r
 292 <P>Now we have a raw.dat file that contains the recovered stream from bad.7z.</P>\r
 293 \r
 294 <P>Most 7z archives are solid. If the bad.7z archive was solid, then the recovered stream\r
 295 consists of the concatenated original files.</P>\r
 296 \r
 297 <P>If the bad.7z archive is not solid, then the recovered stream\r
 298 contains data for one file. It can also contain some garbage data at the end.</P>\r
 299 \r
 300 <H2>Parsing raw stream for recovered solid archive</H2>\r
 301 \r
 302 <P>Now we must use some parser software that will scan raw.dat, search for file \r
 303 signatures and extract files from that file.</P>\r
 304 \r
 305 <P>You can try to use the parser from 7-Zip.</P>\r
 306 \r
 307 <P>You need 7-Zip 9.34 alpha or a later version.</P>\r
 308 \r
 309 <P>Select raw.dat and call the context menu command "7-Zip > Open Archive > #" </P>\r
 310 <P>It shows:</P>\r
 311 <PRE>\r
 312   1.bz2\r
 313   2.readme.txt.gz\r
 314   3.zip\r
 315   4\r
 316   5.xz\r
 317 </PRE>\r
 318 \r
 319 <P> Use the Extract command to extract these files.</P>\r
 320 \r
 321 <P> So we have recovered some of the original files, but without their original names.</P>\r
 322 \r
 323 <P> The 7-Zip parser can find archives in a raw file. But it doesn't recognize other files, \r
 324 like xml, html, jpg, png files and so on. \r
 325 So you probably need some other parser software to extract such files from the raw file.</P>\r
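<P>To get a first idea of what a raw file contains, you can search it for well-known file signatures. The Python sketch below is not the 7-Zip parser. The names SIGNATURES and find_signatures are hypothetical, only a few archive signatures are listed, and short signatures such as "BZh" can also match by accident inside other data.</P>

```python
# A few well-known signatures (magic numbers) at the start of a file.
SIGNATURES = {
    "bz2": b"BZh",
    "gz": bytes.fromhex("1F8B08"),
    "zip": b"PK\x03\x04",
    "xz": bytes.fromhex("FD377A585A00"),
    "7z": bytes.fromhex("377ABCAF271C"),
}


def find_signatures(raw: bytes) -> list:
    """Return sorted (offset, kind) pairs for every signature match."""
    hits = []
    for kind, magic in SIGNATURES.items():
        pos = raw.find(magic)
        while pos != -1:
            hits.append((pos, kind))
            pos = raw.find(magic, pos + 1)
    return sorted(hits)


if __name__ == "__main__":
    with open("raw.dat", "rb") as f:
        for offset, kind in find_signatures(f.read()):
            print("0x%08X %s" % (offset, kind))
```

<P>Each reported offset is only a candidate start of a file. You can add signatures of other formats to the table in the same way.</P>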
 326 \r
 327 <H2>Recovering non-solid archives and archives with multiple solid blocks</H2>\r
 328 \r
 329 <P>If there is more than one solid block in a 7z archive, you must detect the exact end of each solid block and the start of the next solid block.</P>\r
 330 \r
 331 <P>The recovery procedure for that case will be described in the future.</P>\r
 332 \r
 333 </BODY>\r
 334 </HTML>