<div dir="ltr">Thanks, that helped :-)<br></div><br><div class="gmail_quote"><div dir="ltr">On Tue, 8 May 2018 at 15:26, Michal Molhanec &lt;<a href="mailto:mol-python@seznam.cz">mol-python@seznam.cz</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div text="#000000" bgcolor="#FFFFFF">
    <p>Hi,</p>
    <p>I would try requesting the binary data directly:<br>
    </p>
    <p>req = requests.get(file_url, allow_redirects=True, stream=True,
      headers={"Accept": "application/vnd.github.v3.raw"})</p>
    See <a class="m_-6827324699728450330moz-txt-link-freetext" href="https://developer.github.com/v3/git/blobs/#custom-media-types" target="_blank">https://developer.github.com/v3/git/blobs/#custom-media-types</a><br>
    <br>
    I don't actually work with requests or Git, though.<br>
    Still, I quickly tested this and it worked for me.<br>
    <br>
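A rough sketch of how the raw media type could be combined with a progress report that counts bytes rather than chunks. The names save_stream, download_raw and RAW_BLOB_HEADERS are hypothetical, and the 64 KiB chunk size is an arbitrary choice. With the raw media type the body should be the file itself, so the known file size can serve as the total.

```python
import io
from typing import BinaryIO, Callable, Iterable

# Header taken from the suggestion above: ask GitHub for the raw blob bytes.
RAW_BLOB_HEADERS = {"Accept": "application/vnd.github.v3.raw"}


def save_stream(chunks: Iterable[bytes], out: BinaryIO, total: int,
                report: Callable[[int, int], None]) -> int:
    """Write chunks to out and report (bytes_done, total) after each write."""
    done = 0
    for chunk in chunks:
        if not chunk:  # skip empty chunks
            continue
        out.write(chunk)
        done += len(chunk)
        report(done, total)
    return done


def download_raw(url: str, path: str, total: int) -> int:
    """Hypothetical wrapper: stream url into path and print a percentage."""
    import requests  # third-party, imported lazily so the demo below runs offline

    def show(done: int, size: int) -> None:
        print(f"\r{100 * done / size:5.1f}%", end="")

    with requests.get(url, headers=RAW_BLOB_HEADERS, stream=True,
                      allow_redirects=True) as resp:
        resp.raise_for_status()
        with open(path, "wb") as fh:
            return save_stream(resp.iter_content(chunk_size=64 * 1024),
                               fh, total, show)


# Offline demo with fake chunks instead of a network response.
buffer = io.BytesIO()
progress_log = []
written = save_stream([b"abc", b"", b"de"], buffer, total=5,
                      report=lambda done, size: progress_log.append((done, size)))
```

Because the callback receives the running byte count, the bar no longer depends on how many chunks the server happens to deliver.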
    Regards,<br>
    Michal<br>
    <br>
    <div class="m_-6827324699728450330moz-cite-prefix">On 08.05.2018 at 11:01, ZdPo Ster
      wrote:<br>
    </div>
    <blockquote type="cite">
      <div dir="ltr">
        <div>
          <div>
            <div>
              <div>
                <div>
                  <div>
                    <div>Hi all,<br>
                      <br>
                    </div>
                    I can download a large file from GitHub like this:<br>
                    <br>
                    <span style="font-family:monospace,monospace">file_content
                      = requests.get(file_url, allow_redirects=True)<br>
                      file_data = base64.b64decode(file_content.content)<br>
                      open(output, 'wb').write(file_data)</span><br>
                    <br>
                  </div>
                  Since this takes a long time, I want to add a
                  progress bar, and that is where my problems begin ;-). I found
                  that something like this should work:<br>
                  <br>
                  <span style="font-family:monospace,monospace">file_size = 19335882  # known in advance<br>
                    req = requests.get(file_url, allow_redirects=True, stream=True)<br>
                    block_size = 1024<br>
                    num_bars = file_size / (block_size*2)<br>
                    bar = Bar(f'Downloading {filename}', max=num_bars,<br>
                              suffix='%(percent).1f%% - %(eta)ds')<br>
                    bytes_transferred = 0<br>
                    with open(output, "wb") as file:<br>
                        for chunk in req.iter_content(chunk_size=block_size):<br>
                            bytes_transferred += len(chunk)<br>
                            if chunk:<br>
                                file.write(chunk)<br>
                            bar.next()<br>
                    bar.finish()<br>
                  </span></div>
                <span style="font-family:monospace,monospace">print(bytes_transferred)</span><br>
                <br>
              </div>
              My problem: the amount of data transferred does not match the
              file size (26640760 vs 19335882, i.e. the progress bar does not
              show the correct progress), because instead of the file itself
              GitHub sends the file wrapped in JSON and encoded in base64.<br>
              <br>
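As a rough check of the numbers above, the base64 overhead can be estimated from the known file size. This is only a sketch: the function names are hypothetical, and the assumption that the API wraps the base64 text every 60 characters with a newline (serialized in JSON as the two-byte escape \n) is mine, not something confirmed in this thread.

```python
def base64_length(raw_size: int) -> int:
    """Length of the padded base64 text for raw_size bytes (4 chars per 3 bytes)."""
    return 4 * ((raw_size + 2) // 3)


def wrapped_json_length(raw_size: int, line_width: int = 60) -> int:
    """Estimated size of the base64 text inside a JSON string.

    Assumption: one newline after every line_width characters, and each
    newline costs 2 bytes once escaped inside the JSON string.
    """
    encoded = base64_length(raw_size)
    newlines = -(-encoded // line_width)  # ceiling division
    return encoded + 2 * newlines


file_size = 19335882  # size quoted in the message
print(base64_length(file_size))
print(wrapped_json_length(file_size))
```

Under that wrapping assumption the estimate comes to 26640550 bytes, about 210 bytes short of the 26640760 reported, which would be plausible for the remaining JSON fields around the content.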
            </div>
            A workaround could be that, since I know the size of the final
            file, I try to compute the size of the JSON file
            (req.headers.get('Content-Length') does not work on
            GitHub in this case :-( ). After downloading it into
            memory, I would extract the content, decode it, and only then save it...
            The question is whether there is a smarter way to do this...<br>
            <br>
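The in-memory workaround described above could look roughly like this. It is a minimal sketch that assumes the JSON body carries the fields "content" and "encoding", as the blobs API documents; decode_blob is a hypothetical helper name, and the sample payload is fabricated locally for the demo rather than fetched from GitHub.

```python
import base64
import json


def decode_blob(body: bytes) -> bytes:
    """Extract and decode the file bytes from a JSON blob response body."""
    payload = json.loads(body)
    if payload.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {payload.get('encoding')!r}")
    # b64decode skips the embedded newlines by default (validate=False).
    return base64.b64decode(payload["content"])


# Local stand-in for a response body; encodebytes inserts line breaks,
# which mimics a wrapped base64 payload.
sample_body = json.dumps({
    "encoding": "base64",
    "content": base64.encodebytes(b"hello blob").decode("ascii"),
}).encode("utf-8")

decoded = decode_blob(sample_body)
```

The drawback is that the whole JSON document and the decoded file are held in memory at once, which is why requesting the raw bytes directly tends to be simpler for large files.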
          </div>
          PS: my test URL is: <a href="https://api.github.com/repos/tesseract-ocr/tessdata/git/blobs/b01dab8de8174496a0012bf85296943b3e7c81d7" target="_blank">https://api.github.com/repos/tesseract-ocr/tessdata/git/blobs/b01dab8de8174496a0012bf85296943b3e7c81d7</a><br>
          <br>
          <br>
        </div>
        Zd.<br>
      </div>
      <br>
      <fieldset class="m_-6827324699728450330mimeAttachmentHeader"></fieldset>
      <br>
      <pre>_______________________________________________
Python mailing list
<a class="m_-6827324699728450330moz-txt-link-abbreviated" href="mailto:python@py.cz" target="_blank">python@py.cz</a>
<a class="m_-6827324699728450330moz-txt-link-freetext" href="http://www.py.cz/mailman/listinfo/python" target="_blank">http://www.py.cz/mailman/listinfo/python</a>

Visit: <a class="m_-6827324699728450330moz-txt-link-freetext" href="http://www.py.cz" target="_blank">http://www.py.cz</a>
</pre>
    </blockquote>
    <br>
  </div>

</blockquote></div>