Hi Globo Team,
I think there's an issue with the upload module when it is used with the file storage.
When you upload an image to thumbor:
curl -XPOST -F '[email protected];filename=cat-eye.jpg' http://thumbor-server/upload
The response is:
HTTP/1.1 201 Created
Content-Length: 22
Content-Type: text/html; charset=UTF-8
Location: 2012/08/12/cat-eye.jpg
Server: TornadoServer/2.1.1
and the image is created on the file system:
└── 2012
    └── 08
        └── 12
            └── cat-eye.jpg
If we want to replace the image the next day (i.e. 2012-08-13), we make this request:
curl -XPUT -F '[email protected];filename=cat-eye.jpg' http://thumbor-server/upload
The response is:
HTTP/1.1 201 Created
Content-Length: 22
Content-Type: text/html; charset=UTF-8
Location: 2012/08/13/cat-eye.jpg
Server: TornadoServer/2.1.1
and a second image is created on the file system instead of replacing the first one:
└── 2012
    └── 08
        ├── 12
        │   └── cat-eye.jpg
        └── 13
            └── cat-eye.jpg
This happens because the file storage distributes files by date, as discussed in issue #113: the same file name uploaded on a different day resolves to a different directory.
So I think we have to change the strategy for filesystem distribution.
We could use a strategy similar to the one Git uses to store its objects: the first 2 hex digits of sha1(path) name a directory, and the remaining digits name the file.
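To make the problem concrete, here is a minimal sketch of a date-based layout. It is a simplified stand-in and not thumbor's actual code; `date_based_path` and the `/data` root are hypothetical names chosen for illustration.

```python
from datetime import date
from os.path import join


def date_based_path(root, filename, today):
    # Hypothetical stand-in for a time-based layout: root/YYYY/MM/DD/filename.
    return join(root, today.strftime('%Y/%m/%d'), filename)


# Same file name, uploaded on two consecutive days.
first = date_based_path('/data', 'cat-eye.jpg', date(2012, 8, 12))
second = date_based_path('/data', 'cat-eye.jpg', date(2012, 8, 13))
print(first)
print(second)
print(first == second)
```

Because the date is part of the path, the two calls resolve to different locations, so a PUT on the second day cannot overwrite the file stored on the first day.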
Following this strategy, normalize_path in file_storage.py would become:
import hashlib  # needed at the top of file_storage.py
def normalize_path(self, path):
    # Shard by SHA-1 of the path: first 2 hex digits = directory, remaining digits = file name.
    digest = hashlib.sha1(path).hexdigest()
    return join(self.context.config.FILE_STORAGE_ROOT_PATH.rstrip('/'), digest[:2] + '/' + digest[2:])
With this strategy, files would be distributed like this on the filesystem:
├── 6e
│ └── 7ea22ec6a03708fc2ac674580ee2c2fed26f36
├── 73
│ └── dc4c10a915fb41578a0e9dcaf3a99d53e2a785
├── 75
│ └── 47efb441e2bc461b54603e584cd936745b5935
├── 77
│ └── 4555f047b92136a4e65b7f5034f8faeb79a76b
├── 78
│ ├── 4008d66d4b9675a58e5b8faa2ec09b0c7bdb49
│ ├── 9260f3a7034ca116e063388cc33e65941d398b
│ └── b545ac7b9d8ac6d15a7bd2b54d42795c3405ad
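As a standalone illustration of why this layout fixes the replace case, here is a small sketch of the same hashing idea outside of thumbor. The function name `hashed_path` and the `/data/` root are hypothetical, and the `.encode('utf-8')` call is an addition so the sketch also runs on Python 3.

```python
import hashlib
from os.path import join


def hashed_path(root, path):
    # Encode to bytes so hashlib accepts the path on Python 3 as well.
    digest = hashlib.sha1(path.encode('utf-8')).hexdigest()
    # First 2 hex digits name the directory, the remaining 38 name the file.
    return join(root.rstrip('/'), digest[:2], digest[2:])


# The location depends only on the logical path, never on the upload date.
upload = hashed_path('/data/', 'cat-eye.jpg')
replace = hashed_path('/data/', 'cat-eye.jpg')
print(upload)
print(upload == replace)
```

Since the result is a pure function of the path, a POST on one day and a PUT on another resolve to the same file.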
So if we choose this implementation, we should remove resolve_original_photo_path from file_storage.py and implement normalize_path as above.
More generally, each storage system MAY implement a path_on_storage method according to its own constraints.
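A rough sketch of what that could look like follows. All class names here (`Storage`, `HashedFileStorage`, `KeyValueStorage`) are hypothetical and are not thumbor's existing classes; only the method name path_on_storage comes from the proposal.

```python
import hashlib
from os.path import join


class Storage(object):
    # Hypothetical base class: each backend maps a logical path to its own layout.
    def path_on_storage(self, path):
        raise NotImplementedError()


class HashedFileStorage(Storage):
    # Filesystem backend: shard by SHA-1 to keep directories small.
    def __init__(self, root):
        self.root = root.rstrip('/')

    def path_on_storage(self, path):
        digest = hashlib.sha1(path.encode('utf-8')).hexdigest()
        return join(self.root, digest[:2], digest[2:])


class KeyValueStorage(Storage):
    # A key-value backend has no directory constraints, so the path is used as the key.
    def path_on_storage(self, path):
        return path


for storage in (HashedFileStorage('/data/'), KeyValueStorage()):
    print(storage.path_on_storage('cat-eye.jpg'))
```

The upload handler would then only deal with logical paths and leave the physical layout to each backend.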
Nicolas