Need Advice: Store upload documents in MySQL or Server?

All,

I’m making this new registration app with scriptcase 8, and need advice. The application calls for users to upload files, namely PDFs and JPEGs to the server. However, the contents of the PDFs / JPEGS are to be secured and not world view-able. Users should not be able to see other user’s uploaded files. Only “administrators” can see everyone’s uploaded files. Administrators will review the uploaded files and notify users if they are to be accepted or not (granted registration complete status or require more documentation).

Right now, for security I’m going down the path to store the uploaded files into the actual MySQL DB as “longblob” values, and then calling them up via grids/ forms. However, as I research the web, it seems that storing PDFs/ JPEGs inside MySQL is a bad idea because of performance and other issues. I will likely have 4000 users upload about 10mb of files, so I could have a MySQL DB theoretically grow to 40GB.

Again, my number one concern is restricting a user (or web robot) from downloading/ browsing other people’s uploaded files. I’m concerned about just dumping files in some /tmp directory. The entire webpage would be behind a login procedure (home grown, not scriptcase) but I don’t know how to control someone from browsing the /tmp and other upload directories, even at an apache level.

Please advise.

Thanks!

-Aaron

In the current scriptcase distribution the directory and the filenames are in the generated html page. Just show the sourcecode of the page. You can change the directory using tools like tamper. So in this situation I cannot advise your approach. We have reported this issue and we already have a patch from scriptcase so I expect this issue to be solved in one of the next releases.
It is not a good idea to store large files in mysql. It’s not a performance issue, but a maintenance issue. If you ever have to dump your database, i.e. for moving it to another server you run into big issues. If you run apache you can set .htaccess to only allow downloads from php. Then the directory cannot be harvested by google. For IE there are similare solutions I’m sure, but I never used them. Hope this helps a bit.

Albert,

Thanks for the reply. So you’re saying I should store the files outside of the database because of maintenance, which I agree with. However, what controls can I do to prevent people from browsing an entire directory of uploaded files?

http://stackoverflow.com/questions/2530372/how-do-i-disable-directory-browsing

Just use .htaccess on apache or web.config for a similar behaviour on IIS.
In fact we have all directory browsing disabled over the whole server.

Hello,
I am not a ScriptCase expert, but developer. And as @aducom say, maintenance is really difficult for large files. In other hand, databases like SQL Server manages/serves tiny files very well when they are stored at database layer even faster than filesystem. So, whatever be the case, you can do the following (I will give you and example of a architecture).

1. Do not store file in mysql, but filename. In the database you could save the relations between files are user, referring just the filename, take a look to the table:

uploads_table

id_store | filename | mime_type | user | status

1 | foo.jpg | image/jpeg | 1 | in_revision
2 | bar.pdf | application/pdf | 2 | accepted

See how the user 1 only has access to foo.jpg file, and user 2 only have access to bar.pdf file.

2. Restriction for download. Some people is evil, and share the link for resources like example.com/files/bar.pdf, I mean, the direct path to the file. This is wrong, and there is not way control who can access the file or not. So, the trick here is to make you own download manager calling a script instead a file. How can I do that? A blank application could be the answer, so the idea is download links like this: example.com/download?name=bar.pdf or example.com/download?name=foo.jpg


// step 1, read the session to know who is accesing this file
$userGroup = $_SESSION['USER_GROUP'];
$userID= $_SESSION['USER_ID'];

// step 2, is this guy and administrator?
if ($userGroup === 'administrator'){
    // no restriction, give him the resources, which is the name of the file?
    $filename = $_GET['name'];
    $path = $this->Ini->path_docs . $filename;
    $mimeType = execute_query("select mime_type from uploads_table where name = '{$name}' ");
    buildDownload($path, $mimeType);
}
else{
    // if this guy is not admin, then check if he has the perfmission to file
    $row = execute_query("select * from uploads_table where name = '{$name}' and user = {$userID}");
    if ($row != null){
        // cool! he has permission
        $filename = $_GET['name'];
        $path = $this->Ini->path_docs . $filename;
        $mimeType = $row['mime_type'];
        buildDownload($path, $mimeType);
    }
    else{
        echo "sorry, this file is not available";
    }
}

function buildDownload($path, $mimeType){
    header('Pragma: public');
    header('Expires: 0');
    header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
    header('Cache-Control: private', false);
    header('Content-Type: '. $mimeType . '');
    header('Content-Disposition: attachment; filename="'. basename($path ) . '";');
    header('Content-Transfer-Encoding: binary');
    header('Content-Length: ' . filesize($path ));
    readfile($path );
    exit;
}


3. Administrators: Administrator can see every single uploaded file, so you can easily make a grid selecting the records from uploads_table. The magic here is to make a new extra field called “Download” where you can build a grid column allowing the administrator to download the resource and review it, for example, at onRecord() event you can do:


// link to blank application with file requested
{download} = "<a href='download?name=[filename]'>[filename]</a>";

4. Logged users and visitors. Take this case, a user has accepted file and then he share the link example.com/download?name=foo.jpg to his friend. The system will try to read the session of that visitor, when the sql query is executed never get a row, because there is not id session corresponding to the requested file. Between logged users happens the same, the ID read from session do not match with the file associated at the db record.

This is a basic model of resources permissions, with a cool corrections could be very robust link caring from evil people. Take in consideration that:

  1. The code is pseudo-code, it is just giving you the path to follow
  2. Never use that code in production, should be validated before
  3. I am calling functions that do not exist, it is just an example of similar functions that could called at that point.

Advantages:

  1. Files easy to migrate. If you change your server, just move your doc’s directory
  2. Easy to improve functionalities. If eventually you need to track downloaded files, count downloads per file, etc; just inject code into Download application to fit your needs.

Hi, manfred.

You can implement what you want using security based on groups, just only need that, saving a lot of line of code and time, actions can be controlled by configuring the role. controlling the download action would be a little tricky but is possible.

Thanks guys! I’m working on a version similar to what Manfred suggested.