
Can anyone spot why this code returns the wrong uncompressed size every time, never the correct one?

private long Main(string bzip2FilePath)
{
    long totalUncompressedSize = 0;

    try
    {
        var compressedDataByteArray = File.ReadAllBytes(bzip2FilePath);

        using (var mstream = new MemoryStream(compressedDataByteArray))
        using (var unzipstream = new BZip2InputStream(mstream))
        using (var reader = new StreamReader(unzipstream))
        {
            char[] buffer = new char[4096];
            int bytesRead;
            while ((bytesRead = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                totalUncompressedSize += bytesRead;
            }
        }
    }
    catch (Exception ex)
    {
        MessageBox.Show($"Error calculating BZip2 uncompressed size: {ex.Message}");
    }

    return totalUncompressedSize;
}
  • In what way are the results wrong? Is it too short? Too long? If so, by how much? And what are you comparing the result to? Commented Jul 5 at 6:05
  • Why are you using a StreamReader to read text and then counting bytes read? Text is not bytes. It's characters. If you want to know how many bytes there are then you should be reading the raw data. Character count and byte count will only be the same if every character is one byte, which may be the case but probably isn't. Commented Jul 5 at 6:13
  • Please read this and edit your question accordingly. We should be able to run the code you provide and see the errant behaviour you describe, so that would involve starting with some data of a known size to compare to. Commented Jul 5 at 9:40

2 Answers


I think the problem is that StreamReader decodes the stream into characters, so the value you accumulate is a character count, not a byte count. The two differ whenever the encoding uses more than one byte per character. You instead need to use a BinaryReader and read bytes.

I suggest using this:

using (var reader = new BinaryReader(unzipstream))

instead of this: using (var reader = new StreamReader(unzipstream))

You will then also need to update the associated types in your code, like this:

byte[] buffer = new byte[4096];

and the while loop, which looks the same but now reads bytes:

while ((bytesRead = reader.Read(buffer, 0, buffer.Length)) > 0)
{
    totalUncompressedSize += bytesRead;
}
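
To see why the character count and the byte count diverge, here is a small sketch that needs no BZip2 library at all. It runs both loops over the same in-memory UTF-8 data. The class name CountDemo, the helper names CountChars and CountBytes, and the sample string are made up for illustration; they are not part of the question's code.

```csharp
using System;
using System.IO;
using System.Text;

public static class CountDemo
{
    // Counts the way the question's code does: StreamReader decodes to chars.
    public static long CountChars(Stream source)
    {
        using var reader = new StreamReader(source, Encoding.UTF8, true, 4096, leaveOpen: true);
        var buffer = new char[4096];
        long total = 0;
        int read;
        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
            total += read;
        return total;
    }

    // Counts raw bytes straight from the stream, with no decoding step.
    public static long CountBytes(Stream source)
    {
        var buffer = new byte[4096];
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            total += read;
        return total;
    }

    public static void Main()
    {
        // "naïve €5": 'ï' takes 2 bytes and '€' takes 3 bytes in UTF-8.
        byte[] data = Encoding.UTF8.GetBytes("na\u00EFve \u20AC5");
        using var stream = new MemoryStream(data);

        long chars = CountChars(stream);
        stream.Position = 0;               // rewind before the second pass
        long bytes = CountBytes(stream);

        Console.WriteLine($"chars={chars}, bytes={bytes}");   // chars=8, bytes=11
    }
}
```

The same mismatch applies to the decompressed stream in the question: anything other than pure single-byte text will give a character total that differs from the real size.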

It's unclear why you are using a StreamReader at all. You just need the number of bytes, not characters. You also don't need to load the whole file into a byte array first.

using var file = File.OpenRead(bzip2FilePath);   // File.Open has no path-only overload
using var unzipstream = new BZip2InputStream(file);
if (unzipstream.CanSeek)   // grab the length if we can
    return unzipstream.Length;

// otherwise loop the stream, counting raw decompressed bytes
long totalBytes = 0;       // long, to match the method's return type
var buffer = new byte[8192];
int bytesRead;
while ((bytesRead = unzipstream.Read(buffer, 0, buffer.Length)) > 0)   // read from the stream itself; there is no reader here
{
    totalBytes += bytesRead;
}
return totalBytes;
