MSBuild is failing inconsistently when performing a TFS build (usually error C1093 / Not enough Storage)

7

I have a really odd and hard to diagnose issue with MSBuild / TFS. I have a solution that contains about 12 different build configurations. When running on the build server, it takes maybe 30mins to build the lot and has worked fine for weeks now but now is occasionally failing.

Most of the time, when it fails it'll be an error like this:

19:25:45.037 2>TestPlanDocument.cpp(1): fatal error C1093: API call 'GetAssemblyRefHash' failed '0x8007000e' : ErrorMessage: Not enough storage is available to complete this operation. [C:\Builds\1\ICCSim Card Test Controller\ICCSimCTC Release\src\CardTestController\CardTestController.vcxproj]

The error will sometimes happen on a different file. It won't happen for every build configuration either, it's very inconsistent and occasionally even builds all of them successfully. There's not much different between the build configurations either, mostly it's just a few string changes and of course they all build locally just fine.

The API call in question is usually GetAssemblyRefHash but not always. I don't think this is the issue, as Googling for GetAssemblyRefHash specifically brings up next to nothing. I suspect there's some kind of resource issue at play here but I'm at a loss as to what: There's plenty of HDD space (Hundreds of GB's), plenty of RAM (Machine originally had 4GB minimum allocated but was dynamic as it's a Hyper-v - it never pushed above 2.5GB. I upped this to 8GB minimum just in case and there's been no change).

I've set the build verbosity to diagnostic and it doesn't really show anything else that's helpful, just the same error.

For reference, the build server is fully up to date on all patches. It's running Windows Server 2012 R2, has TFS 2013 and VS 2013 installed, both are on Update 4.

I'm really at a loss at this point and would appreciate any help or pointers.

EDIT: Just to keep people up to date, the compile toolchain was in 32bit mode however even after switching to 64bit, the issue persists.

tfs
visual-studio-2013
msbuild
asked on Stack Overflow Dec 17, 2014 by Kushan • edited Dec 22, 2014 by Kushan

2 Answers

2

I think I found the source, but I still don't know the reason.

Browsing through the Microsoft Shared Source, we can find the source for GetAssemblyRefHash():

HRESULT CAsmLink::GetAssemblyRefHash(mdToken FileToken, const void** ppvHash, DWORD* pcbHash)
{
    if (TypeFromToken(FileToken) != mdtAssemblyRef) {
        VSFAIL( "You can only get AssemblyRef hashes for assemblies!");
        return E_INVALIDARG;
    }

    HRESULT hr;
    CAssembly *file = NULL;
    if (FAILED(hr = m_pImports->GetFile( FileToken, (CFile**)&file)))
        return hr;

    return file->GetHash(ppvHash, pcbHash);
}

Only two places here to investigate - the call to m_pImports->GetFile(), where m_pImports is CAssembly *m_pImports;, the other is file->GetHash().

m_pImports->GetFile() is here, and is a dead end:

HRESULT CAssembly::GetFile(DWORD index, CFile** file)
{
    if (!file)
        return E_POINTER;

    if (RidFromToken(index) < m_Files.Count()) {
        if ((*file = m_Files.GetAt(RidFromToken(index))))
            return S_OK;
    }
    return ReportError(E_INVALIDARG);
}

file->GetHash(), which is here:

HRESULT CAssembly::GetHash(const void ** ppvHash, DWORD *pcbHash)
{
    ASSERT( ppvHash && pcbHash);
    if (IsInMemory()) {
        // We can't hash an InMemory file
        *ppvHash = NULL;
        *pcbHash = 0;
        return S_FALSE;
    }

    if (!m_bDoHash || (m_cbHash && m_pbHash != NULL)) {
        *ppvHash = m_pbHash;
        *pcbHash = m_cbHash;
        return S_OK;
    }

    DWORD cchSize = 0, result;

    // AssemblyRefs ALWAYS use CALG_SHA1
    ALG_ID alg = CALG_SHA1;
    if (StrongNameHashSize( alg, &cchSize) == FALSE)
        return ReportError(StrongNameErrorInfo());

    if ((m_pbHash = new BYTE[cchSize]) == NULL)
        return ReportError(E_OUTOFMEMORY);
    m_cbHash = cchSize;

    if ((result = GetHashFromAssemblyFileW(m_Path, &alg, (BYTE*)m_pbHash, cchSize, &m_cbHash)) != 0) {
        delete [] m_pbHash;
        m_pbHash = 0;
        m_cbHash = 0;
    }
    *ppvHash = m_pbHash;
    *pcbHash = m_cbHash;

    return result == 0 ? S_OK : ReportError(HRESULT_FROM_WIN32(result));
}

We can see that about halfway down, it tries to allocate room to store the byte[] result, and when fails, returns E_OUTOFMEMORY, which is the error code you're seeing:

if ((m_pbHash = new BYTE[cchSize]) == NULL)
    return ReportError(E_OUTOFMEMORY);
m_cbHash = cchSize;

There's other paths to consider, but this seems like the most obvious source. So it looks like the problem is that a plain memory allocation is failing.

What could cause this?

  • Lack of free physical memory pages / swap
  • Memory fragmentation in the process.
  • Inability to reserve commit space for this in the swap file
  • Lack of address space

At this point, my best guess would be memory fragmentation. Have you triple checked that the Microsoft CPP compiler is running in 64-bit mode? Perhaps see if you can debug the compiler (Microsoft symbol servers may be able to help you here), and set a breakpoint for that line and dump the heap when it happens.

Some specifics on diagnosing heap fragmentation - fire up sysinternal's VMMap when the compiler breaks, and look at the free list - you need three chunks at least 64 kB free to perform an allocation; less than 64 kB and it won't get used, and two 64 kB chunks are reserved.

answered on Stack Overflow Dec 19, 2014 by antiduh • edited Dec 19, 2014 by antiduh
0

Okay, I have an update to this! I opened a support ticket with Microsoft and have been busy working with them to figure out the issue. They went down the same paths as outlined above and came to the same conclusion - it's not a resources issue.

To cut a long story short, Microsoft has now acknowledged that this is likely a bug in the VC++ compiler, which is almost certainly caused by a race condition (Though this is unconfirmed). There's no word on if they'll fix it in a future release.

There is a workaround by using the /MP flag at the project level to limit the number of compiler processes opened by MSBuild without disabling multiple instances entirely (Which for me was doubling build times).

To do this, go to your project properties and under Configuration Properties -> C/C++ -> Command Line, you need to specify the /MP flag and then a number to limit the number of processes.

Setting the /MP flag

My build server has 8 Virtual CPU's and the normal behaviour is equivelant to /MP8 but this causes the bug to sometimes appear. For me, using /MP4 seems to be enough to limit the bug without causing build times to increase too much. If you're seeing a bug similar to this, you may need to experiment with other numbers such as /MP6 or /MP2.

answered on Stack Overflow Feb 13, 2015 by Kushan

User contributions licensed under CC BY-SA 3.0