Azure Data Lake storage file listing (GetPathsAsync does not return)


I am having trouble listing the directories in Azure Data Lake Storage. I am pretty much using the default template given for Azure Data Lake Storage Gen2 for listing the directories on a file system. I have wrapped the code in a unit test, but I get some sort of JSON serialization issue. This is the async task I am calling from a unit test method.

public async Task<List<string>> ListFilesInDirectory(string Directory)
{
    IAsyncEnumerator<PathItem> enumerator =
        _dataLakeFileSystemClient.GetPathsAsync(Directory).GetAsyncEnumerator();
    await enumerator.MoveNextAsync();
    List<string> somelist = new List<string>();

    PathItem item = enumerator.Current;
    while (item != null)
    {
        Console.WriteLine(item.Name);
        somelist.Add(item.Name);
        if (!await enumerator.MoveNextAsync())
        {
            break;
        }

        item = enumerator.Current;
    }

    return somelist;
}
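
For comparison, the same listing can be sketched with `await foreach`, which takes care of advancing and disposing the enumerator. This is only a sketch: it assumes `_dataLakeFileSystemClient` is a `DataLakeFileSystemClient` from the Azure.Storage.Files.DataLake package, the wrapper class name `DirectoryLister` is made up for illustration, and it is not claimed to resolve the error described in this question.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Azure.Storage.Files.DataLake;
using Azure.Storage.Files.DataLake.Models;

// Hypothetical wrapper class, named here only for illustration.
public class DirectoryLister
{
    private readonly DataLakeFileSystemClient _dataLakeFileSystemClient;

    public DirectoryLister(DataLakeFileSystemClient client)
    {
        _dataLakeFileSystemClient = client;
    }

    public async Task<List<string>> ListFilesInDirectory(string directory)
    {
        var names = new List<string>();

        // GetPathsAsync returns an AsyncPageable<PathItem>; await foreach
        // requests further pages from the service as the loop advances.
        await foreach (PathItem item in _dataLakeFileSystemClient.GetPathsAsync(directory))
        {
            names.Add(item.Name);
        }

        return names;
    }
}
```

An empty directory simply yields an empty list here, so no null check on the current item is needed.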

This is my error message.

System.AggregateException
HResult=0x80131500
Message=One or more errors occurred. ('<' is an invalid start of a value. LineNumber: 0 | BytePositionInLine: 0.)
Source=System.Private.CoreLib
StackTrace:
    at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
    at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
    at System.Threading.Tasks.Task.Wait()
    at HistoricMarketDataLibTests.HistoricIceDataManagerTests.DirectoryListing(String Directory) in C:\Users\bbf22\source\repos\HistoricMarketDataClient\BarDefinitionTests\HistoricIceDataManagerTests.cs:line 143

This exception was originally thrown at this call stack:
    System.Text.Json.ThrowHelper.ThrowJsonReaderException(ref System.Text.Json.Utf8JsonReader, System.Text.Json.ExceptionResource, byte, System.ReadOnlySpan)
    System.Text.Json.Utf8JsonReader.ConsumeValue(byte)
    System.Text.Json.Utf8JsonReader.ReadFirstToken(byte)
    System.Text.Json.Utf8JsonReader.ReadSingleSegment()
    System.Text.Json.Utf8JsonReader.Read()
    System.Text.Json.JsonDocument.Parse(System.ReadOnlySpan, System.Text.Json.Utf8JsonReader, ref System.Text.Json.JsonDocument.MetadataDb, ref System.Text.Json.JsonDocument.StackRowStack)
    System.Text.Json.JsonDocument.Parse(System.ReadOnlyMemory, System.Text.Json.JsonReaderOptions, byte[])
    System.Text.Json.JsonDocument.Parse(System.ReadOnlyMemory, System.Text.Json.JsonDocumentOptions)
    System.Text.Json.JsonDocument.Parse(string, System.Text.Json.JsonDocumentOptions)
    Azure.Storage.Files.DataLake.ErrorExtensions.CreateException(string, Azure.Core.Pipeline.ClientDiagnostics, Azure.Response)
    ... [Call Stack Truncated]

Inner Exception 1: JsonReaderException: '<' is an invalid start of a value. LineNumber: 0 | BytePositionInLine: 0.
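
A note on the outer `System.AggregateException`: the stack trace passes through `Task.Wait()`, and blocking on a task this way wraps whatever the task threw in an `AggregateException`. The sketch below illustrates only that wrapping, using a made-up `FailAsync` method in place of the real call; it says nothing about the underlying cause.

```csharp
using System;
using System.Threading.Tasks;

public static class WaitVersusAwait
{
    // Hypothetical stand-in for an async call that fails.
    private static async Task FailAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("simulated failure");
    }

    public static async Task Main()
    {
        try
        {
            // Blocking on the task: the failure arrives wrapped in an AggregateException.
            FailAsync().Wait();
        }
        catch (AggregateException ex)
        {
            Console.WriteLine($"Wait(): {ex.GetType().Name} wrapping {ex.InnerException?.GetType().Name}");
        }

        try
        {
            // Awaiting the task: the original exception type is rethrown directly.
            await FailAsync();
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine($"await: {ex.GetType().Name}: {ex.Message}");
        }
    }
}
```

If the test framework in use supports `async Task` test methods, awaiting the call in the test would surface the inner exception without the wrapper.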

I get item == null, so the loop does not iterate over the PathItems at all. I wonder whether this is related to some service request limit, as the data storage is big, but then I would expect to get a reasonable error code or message. I wonder what lies behind the JSON serialization issue. Is this related to the async tasks invoked? I should also mention that I can query the existence of specific folders or files, but whenever I invoke GetPathsAsync or GetPaths I run into trouble. The files involved are large, though, and there are many of them. I wonder whether this causes some sort of service request issue and whether I should be thinking about mapping file locations in a SQL-based backend.
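
One possible reading of the call stack: the `JsonDocument.Parse` frames sit directly under `ErrorExtensions.CreateException`, which suggests (but does not prove) that the client was already trying to build an exception from a service response and that the response body started with `<`, for example XML or HTML rather than JSON. The sketch below uses a made-up XML body, since the real response is not shown, to illustrate how parsing such a body as JSON fails.

```csharp
using System;
using System.Text.Json;

public static class NonJsonBodyDemo
{
    public static void Main()
    {
        // Made-up XML error body; the actual service response is unknown.
        string body = "<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>Hypothetical</Code></Error>";

        try
        {
            using JsonDocument doc = JsonDocument.Parse(body);
        }
        catch (JsonException ex)
        {
            // The message should resemble the "'<' is an invalid start of a value" text above.
            Console.WriteLine(ex.Message);
        }
    }
}
```

Under that reading, the JSON error would hide the real service error instead of being the root cause, so inspecting the raw HTTP response (status code and body) would be the next thing to check.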

I should also say that I have had a bit more success using AzureRMR in R to query the content of some subfolders, but querying any of their parent folders takes a lot of time. While I have had no success querying any subfolder in .NET using the code above, I still feel it may be related to some tasks not returning their results in time. But then again, this is just me making uneducated guesses.

I would appreciate any help. I am far from being an expert on Azure .NET APIs and async tasks.

c#
.net
azure
azure-data-lake-gen2
iasyncenumerable
asked on Stack Overflow Jun 9, 2020 by Gkhan Cebs • edited Jun 10, 2020 by Theodor Zoulias

0 Answers

Nobody has answered this question yet.


User contributions licensed under CC BY-SA 3.0