How to iterate through and delete certain files from Python fcache?


In my PyQt5 app, I've been using fcache (https://pypi.org/project/fcache/) to cache lots of small files to the user's temp folder for speed. It's working well for caching, but now I need to be able to iterate through the cached files and selectively delete files that are no longer needed.

However, when I try to iterate through the FileCache object, I'm getting an error.

thisCache is the name of my cache, and if I print(thisCache) I get: <fcache.cache.FileCache object at 0x000001F7BA0F2848>, which is fine.

Then if I do print(thisCache.keys()) I get KeysView(<fcache.cache.FileCache object at 0x000001F7BA0F2848>), which seems correct (I think?). Similarly, printing .values() gives me a ValuesView.

Then if I do print(len(thisCache.keys())) I get: 1903, showing that there are 1903 files in there, which is probably correct. But here's where I get stuck.

If I try to iterate through the KeysView or ValuesView in any way, the process crashes. Each of the following attempts:

for f in thisCache.values():
for f in thisCache.keys():

always ends with: Process finished with exit code -1073740791 (0xC0000409)

I'm fairly new to Python, so am I just misunderstanding how I'm supposed to iterate through this list? Or is there a bug or gotcha here that I need to work around?

Thanks

::::::::: EDIT ::::::::

After a bit of a delay, here's a reproducible (but not especially minimal or polished) bit of example code.

import random
import string
from fcache.cache import FileCache
from shutil import copyfile

def random_string(stringLength=10):
    letters = string.ascii_lowercase
    return ''.join(random.choice(letters) for i in range(stringLength))

cacheName = "TestCache"
cache = FileCache(cacheName)

sourceFile = "C:\\TestFile.mov"
targetCount = 50

# copy the file 50 times:
for w in range(1, targetCount+1):
    fileName = random_string(50) + ".mov"
    targetPath = cache.cache_dir + "\\" + fileName
    print("Copying file ", w)
    copyfile(sourceFile, targetPath)
    cache[str(w)] = targetPath
print("Cached", targetCount, "items.")

print("Syncing cache...")
cache.sync()

# iterate through the cache:
print("Item keys:", cache.keys())
for key in cache.keys():
    v = cache[key]
    print(key, v)

print("Cache read.")

There is one dependency, which is having a file called "C:\TestFile.mov" on your system, but the path isn't important so this can be pointed to any file. I've tested with other file formats, with the same result.

The error that is thrown is:

Traceback (most recent call last):
  File "C:\Users\stuart.bruce\AppData\Local\Programs\Python\Python37\lib\encodings\hex_codec.py", line 19, in hex_decode
    return (binascii.a2b_hex(input), len(input))
binascii.Error: Non-hexadecimal digit found

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\stuart.bruce\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\stuart.bruce\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\stuart.bruce\PycharmProjects\testproject\test_code.py", line 32, in <module>
    for key in cache.keys():
  File "C:\Users\stuart.bruce\AppData\Local\Programs\Python\Python37\lib\_collections_abc.py", line 720, in __iter__
    yield from self._mapping
  File "C:\Users\stuart.bruce\AppData\Local\Programs\Python\Python37\lib\site-packages\fcache\cache.py", line 297, in __iter__
    yield self._decode_key(key)
  File "C:\Users\stuart.bruce\AppData\Local\Programs\Python\Python37\lib\site-packages\fcache\cache.py", line 211, in _decode_key
    bkey = codecs.decode(key.encode(self._keyencoding), 'hex_codec')
binascii.Error: decoding with 'hex_codec' codec failed (Error: Non-hexadecimal digit found)

Line 32 of test_code.py (as mentioned in the error) is the line for key in cache.keys():, so this is where a non-hexadecimal character seems to be found. But firstly I'm not sure why, and secondly I don't know how to get around it.

(PS. Please note that if you run this code, you'll end up with 50 copies of your chosen file in your temp folder, and nothing will tidy it up automatically!)

python
caching
asked on Stack Overflow Dec 5, 2019 by Stuart Bruce • edited Dec 9, 2019 by musicamante

1 Answer


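After reading the sources of fcache, it seems that cache_dir should only be used by fcache itself: it reads every file in that directory to find previously created cache data, and it expects each file name to be a hex-encoded key.

The program (or, better, the module) crashes because you created other files in that directory and it cannot deal with them: their names are not valid hex, which is what the "Non-hexadecimal digit found" error in _decode_key is reporting.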
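
The decoding failure can be reproduced without fcache at all. The following is only a sketch of the step shown in the traceback (codecs.decode(..., 'hex_codec')), not fcache's actual code: decode_key is a hypothetical helper, and the 'utf-8' key encoding and the file name "qwerty.mov" are assumptions chosen for illustration.

```python
import binascii
import codecs


def decode_key(hex_name: str) -> str:
    """Hypothetical stand-in for the decode step in the traceback:
    turn a hex-encoded file name back into the original key."""
    raw = codecs.decode(hex_name.encode("utf-8"), "hex_codec")
    return raw.decode("utf-8")


# A key such as "1" is stored under its hex form, which decodes cleanly.
encoded = codecs.encode("1".encode("utf-8"), "hex_codec").decode("utf-8")
print(encoded, "->", decode_key(encoded))

# A file name that was never hex-encoded (like the copied .mov files)
# contains non-hex characters, so the same decode step raises.
try:
    decode_key("qwerty.mov")
    failed = False
except binascii.Error as exc:
    failed = True
    print("decode failed:", exc)
```

Any file placed in cache_dir by hand goes through the same decode step during iteration, which is why a single stray file is enough to break cache.keys().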

The solution is to use another directory to store those files.

import os

# ...

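# Keep the copied files next to (not inside) fcache's own directory.
data_dir = os.path.join(os.path.dirname(cache.cache_dir), 'data')
# Create the directory if it does not exist yet.
os.makedirs(data_dir, exist_ok=True)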
for w in range(1, targetCount+1):
    fileName = random_string(50) + ".mov"
    targetPath = os.path.join(data_dir, fileName)
    copyfile(sourceFile, targetPath)
    cache[str(w)] = targetPath
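
Once the data files live outside cache_dir, iterating over the cache and selectively deleting entries (the original goal) could look roughly like the sketch below. It assumes the cache behaves like a standard mutable mapping of key to file path, which the _collections_abc frames in the traceback suggest; prune_cache and should_delete are hypothetical names, and a plain dict stands in for the FileCache object so the example runs on its own.

```python
import os
import tempfile
from typing import Callable, List, MutableMapping


def prune_cache(cache: MutableMapping[str, str],
                should_delete: Callable[[str, str], bool]) -> List[str]:
    """Delete every cached file (and its cache entry) that the
    should_delete(key, path) predicate selects. Returns the removed keys."""
    removed = []
    # Snapshot the keys so the mapping is not modified while iterating.
    for key in list(cache.keys()):
        path = cache[key]
        if should_delete(key, path):
            if os.path.exists(path):
                os.remove(path)   # remove the data file itself
            del cache[key]        # then drop the cache entry
            removed.append(key)
    return removed


# Demonstration: a plain dict stands in for the FileCache object.
demo_dir = tempfile.mkdtemp()
demo_cache = {}
for n in range(1, 5):
    demo_path = os.path.join(demo_dir, "file%d.mov" % n)
    with open(demo_path, "wb") as fh:
        fh.write(b"x")
    demo_cache[str(n)] = demo_path

# Remove the entries with even-numbered keys.
removed_keys = prune_cache(demo_cache, lambda key, path: int(key) % 2 == 0)
print("removed:", removed_keys)
```

With the real cache, the same function would be called as prune_cache(cache, some_predicate), presumably followed by cache.sync() as in the question's code.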
answered on Stack Overflow Dec 9, 2019 by musicamante

User contributions licensed under CC BY-SA 3.0