[C++] File Downloading
by Robert Fritzen · in Technical Issues · 09/09/2012 (12:55 pm) · 10 replies
So, I'm updating my patcher client for my game projects, the only problem is that fileDownloads don't work for the .dll/.exe from inside the game it's being ran on. So I need an alternate, and an external C++ patcher seems to be the best way to go here.
I've tried using libCURL to accomplish this task, but it too is failing to completely download the .exe/.dll files. Here is a code snippet to download the files, do you have any suggestions to fix this problem? Is there a better way (That does not include switching to another language) to do this?
This currently has all of my projects at a deadstop, because I am in desperate need of a working patching system that can handle the .exe/.dll updates, so as you can imagine, priority here is getting this done and working quickly.
I've tried using libCURL to accomplish this task, but it too is failing to completely download the .exe/.dll files. Here is a code snippet to download the files, do you have any suggestions to fix this problem? Is there a better way (That does not include switching to another language) to do this?
size_t write_data(void *ptr, size_t size, size_t nmemb, FILE *stream) {
size_t written = fwrite(ptr, size, nmemb, stream);
return written;
}
//downloadFile - Thread much?
void multiDownload::downloadFile(download_list_item *single) {
CURL *curl;
CURLcode res;
curl = curl_easy_init();
//create the path, and necessary directory scheme
std::string path = drcSet->obtWrkDir();
path.append("/");
path.append(single->filePath);
std::string path_only;
path_only = path.substr(0, path.find_last_of("/")).append("/");
boost::filesystem::create_directories(path_only);
//get the filename/url to download
std::string fullURL = single->getFP();
std::string fileName = single->filePath;
cout << "Download: " << fullURL << "|" << fileName;
//
//std::string data = uRead->readURL(fullURL.c_str());
//cout << endl << "DATA:" << endl << data << endl;
FILE *fp;
if (curl) {
fp = fopen(fileName.c_str(), "wb");
curl_easy_setopt(curl, CURLOPT_URL, fullURL.c_str());
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
res = curl_easy_perform(curl);
curl_easy_cleanup(curl);
fclose(fp);
}
}This currently has all of my projects at a deadstop, because I am in desperate need of a working patching system that can handle the .exe/.dll updates, so as you can imagine, priority here is getting this done and working quickly.
About the author
Illinois Grad. Retired T3D Developer / Pack Dev.
#2
What does the application look like to the server? What user agent is it telling the server it is? Some servers may be setup to block bots.
I know in Python using some of their web related code I explicitly set the user agent to a specific string. Then I use the server to check for that user agent before starting an authentication. Some servers may block downloads by stuff that looks like a bot. You may be able to tell your libraries to provide a user agent for something like Firefox. I know you can do this with wget as a command like option.
Which BTW, that might be an option if the wget program license allows it to be bundled with proprietary software. You certainly would not be compiling it into your app. It could be launched from a shell. Check the license carefully though...
09/11/2012 (11:16 pm)
@Robert,What does the application look like to the server? What user agent is it telling the server it is? Some servers may be setup to block bots.
I know in Python using some of their web related code I explicitly set the user agent to a specific string. Then I use the server to check for that user agent before starting an authentication. Some servers may block downloads by stuff that looks like a bot. You may be able to tell your libraries to provide a user agent for something like Firefox. I know you can do this with wget as a command like option.
Which BTW, that might be an option if the wget program license allows it to be bundled with proprietary software. You certainly would not be compiling it into your app. It could be launched from a shell. Check the license carefully though...
#3
There is also another problem. When I run the .exe alone (outside debugging in MSVC), it crashes, however if I debug it (while in release mode), the program runs through. A debug build is complaining about a heap error, but I'm not sure where this is originating from.
I've posted the source code here if you want to take a look at it and give me some help to get this working.
09/12/2012 (6:07 am)
The server works fine without the user agent string. I added one just to be sure with the same result. The problem is it only downloads exactly 420kb of .ddl and .exe files. I'm not sure if this is just a defaulting limit of cURL, in which case all I need to to find out how to bypass this. I should note that I've downloaded .cs, .mis, and .txt files perfectly fine under this program.There is also another problem. When I run the .exe alone (outside debugging in MSVC), it crashes, however if I debug it (while in release mode), the program runs through. A debug build is complaining about a heap error, but I'm not sure where this is originating from.
I've posted the source code here if you want to take a look at it and give me some help to get this working.
#4
curl.haxx.se/mail/lib-2003-04/0035.html
When I searched for download limit on curl I saw a link that talked about a 30MB limit and a 2GB limit. I don't think there is a limit at work here. I am grabbing the pycurl library for Python to see what I can try.
09/12/2012 (10:05 pm)
I downloaded your code, but I am not sure what I am looking at. But I did a search and found this thread. It has a simple fix that was a copy paste error, but it also has some code that might help. It suggested getting the command line version of curl and seeing if it still works. If it does it is a code problem. If not it could be a server issue.curl.haxx.se/mail/lib-2003-04/0035.html
When I searched for download limit on curl I saw a link that talked about a 30MB limit and a 2GB limit. I don't think there is a limit at work here. I am grabbing the pycurl library for Python to see what I can try.
#5
It is written in Python, but it may give you some ideas for options that might make your C version work better. This should be a fairly good translation of the curl api to a python library. They tend to follow very closely. I downloaded the 6 meg fractal image I posted here: www.garagegames.com/community/blogs/view/21873. It worked fine and did not error out early at all.
09/12/2012 (10:19 pm)
I found this code here:#! /usr/bin/env python
# -*- coding: iso-8859-1 -*-
# vi:ts=4:et
# $Id: retriever.py,v 1.19 2005/07/28 11:04:13 mfx Exp $
#
# Usage: python retriever.py <file with URLs to fetch> [<# of
# concurrent connections>]
#
import sys, threading, Queue
import pycurl
# We should ignore SIGPIPE when using pycurl.NOSIGNAL - see
# the libcurl tutorial for more info.
try:
import signal
from signal import SIGPIPE, SIG_IGN
signal.signal(signal.SIGPIPE, signal.SIG_IGN)
except ImportError:
pass
# Get args
num_conn = 10
try:
if sys.argv[1] == "-":
urls = sys.stdin.readlines()
else:
urls = open(sys.argv[1]).readlines()
if len(sys.argv) >= 3:
num_conn = int(sys.argv[2])
except:
print "Usage: %s <file with URLs to fetch> [<# of concurrent connections>]" % sys.argv[0]
raise SystemExit
# Make a queue with (url, filename) tuples
queue = Queue.Queue()
for url in urls:
url = url.strip()
if not url or url[0] == "#":
continue
filename = "doc_%03d.dat" % (len(queue.queue) + 1)
queue.put((url, filename))
# Check args
assert queue.queue, "no URLs given"
num_urls = len(queue.queue)
num_conn = min(num_conn, num_urls)
assert 1 <= num_conn <= 10000, "invalid number of concurrent connections"
print "PycURL %s (compiled against 0x%x)" % (pycurl.version, pycurl.COMPILE_LIBCURL_VERSION_NUM)
print "----- Getting", num_urls, "URLs using", num_conn, "connections -----"
class WorkerThread(threading.Thread):
def __init__(self, queue):
threading.Thread.__init__(self)
self.queue = queue
def run(self):
while 1:
try:
url, filename = self.queue.get_nowait()
except Queue.Empty:
raise SystemExit
fp = open(filename, "wb")
curl = pycurl.Curl()
curl.setopt(pycurl.URL, url)
curl.setopt(pycurl.FOLLOWLOCATION, 1)
curl.setopt(pycurl.MAXREDIRS, 5)
curl.setopt(pycurl.CONNECTTIMEOUT, 30)
curl.setopt(pycurl.TIMEOUT, 300)
curl.setopt(pycurl.NOSIGNAL, 1)
curl.setopt(pycurl.WRITEDATA, fp)
try:
curl.perform()
except:
import traceback
traceback.print_exc(file=sys.stderr)
sys.stderr.flush()
curl.close()
fp.close()
sys.stdout.write(".")
sys.stdout.flush()
# Start a bunch of threads
threads = []
for dummy in range(num_conn):
t = WorkerThread(queue)
t.start()
threads.append(t)
# Wait for all threads to finish
for thread in threads:
thread.join()It is written in Python, but it may give you some ideas for options that might make your C version work better. This should be a fairly good translation of the curl api to a python library. They tend to follow very closely. I downloaded the 6 meg fractal image I posted here: www.garagegames.com/community/blogs/view/21873. It worked fine and did not error out early at all.
#6
09/13/2012 (6:33 am)
I'll try it from the command line to see what happens.
#8
You need to take a slow walk and drink some coffee. Just step away from the computer. You have been staring at it too long...
Here is the 420 bytes I got from downloading your file:
It does not like spaces in the url. The spaces need to be converted to the space equivalent like %20 I think.
Change the url to this:
I just did and got a 590,336 bytes file which has an exe signature in the file (starts with MZ).
Edit:
It is not your code, it is not cURL, it is your freakin' URL. That sounds kind of catchy. Almost like a geek song...
09/13/2012 (9:33 pm)
@Robert,You need to take a slow walk and drink some coffee. Just step away from the computer. You have been staring at it too long...
Here is the 420 bytes I got from downloading your file:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <html><head> <title>404 Not Found</title> </head><body> <h1>Not Found</h1> <p>The requested URL /public/TU0/Patcher/0.0.3.0/Tactical was not found on this server.</p> <p>Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.</p> <hr> <address>Apache Server at www.phantomdev.net Port 80</address> </body></html>
It does not like spaces in the url. The spaces need to be converted to the space equivalent like %20 I think.
Change the url to this:
http://www.phantomdev.net/public/TU0/Patcher/0.0.3./Tactical%20Uprising%20Beginnings.exe
I just did and got a 590,336 bytes file which has an exe signature in the file (starts with MZ).
Edit:
It is not your code, it is not cURL, it is your freakin' URL. That sounds kind of catchy. Almost like a geek song...
#9
EDIT:
Still crashing due to a heap error though... not sure why.
EDIT 2:
Got it! Replaced the const char *'s in the downloadListItem struct with std::string... works beautifully...
09/14/2012 (6:16 am)
LOL! Wow it was really just that? Hah, thanks a ton.EDIT:
Still crashing due to a heap error though... not sure why.
EDIT 2:
Got it! Replaced the const char *'s in the downloadListItem struct with std::string... works beautifully...
#10
No problem. I learned how to use cURL for Python. Thanks for the challenge.
09/14/2012 (5:54 pm)
@Robert,No problem. I learned how to use cURL for Python. Thanks for the challenge.

Torque Owner Robert Fritzen
Phantom Games Development
You do realize that this code does not work, right?