IFilter from IStream



  • My application indexes office documents (MS Office 2003, 2007). If I load the IFilter from a specific file, everything is fine and the filter works:

    HRESULT hr_f = LoadIFilter(filename, 0, (void **)&pFilter);
    

    However, initializing it from a buffer:

    HRESULT hr_ss = BindIFilterFromStream(spStream, 0, (void **)&pFilter);
    

    returns E_FAIL, and pFilter naturally does not work. I derived my own class from IStream and implemented its methods. The key thing needed to initialize the right filter, I suspect, is in this method:

    HRESULT StreamFilter::Stat(STATSTG * pstatstg, DWORD grfStatFlag)
    {
        // Microsoft Office IFilter CLSID from the Windows Registry:
        // {f07f3920-7b8c-11cf-9be8-00aa004b9986}
        const IID CLSID_IFilter = {
            0xf07f3920,
            0x7b8c,
            0x11cf,
            { 0x9b, 0xe8, 0x00, 0xaa, 0x00, 0x4b, 0x99, 0x86 }
        };

        LARGE_INTEGER pSize;
        int fl = GetFileSizeEx(_hFile, &pSize);
        memset(pstatstg, 0, sizeof(STATSTG));
        pstatstg->clsid = CLSID_IFilter;
        pstatstg->type = STGTY_STREAM;
        pstatstg->cbSize.QuadPart = pSize.QuadPart;

        return S_OK;
    }

    I have tried various ways of initializing the pstatstg structure, to no avail.
    Judging by the call stack, after this method is called, control passes into query.dll and E_FAIL comes back from there. I don't know what else might be required.

    There is a similar question, https://stackoverflow.com/questions/7313828/using-ifilter-in-c-sharp-and-retrieving-file-from-database-rather-than-file-syst/19504946?s=1%7C0.1214#19504946 and the method described there for *.pdf really does work in C++. Unfortunately, it does not work for MS Office documents.



  • Anyway, after a long search, and inspired by the solution at https://stackoverflow.com/questions/7313828/using-ifilter-in-c-sharp-and-retrieving-file-from-database-rather-than-file-syst/19504946?s=1%7C0.1214#19504946 , I put together a workaround that does what I need.

    Let the system choose the right handler based on the extension of the file being processed:

    HRESULT hr = LoadIFilter(L".doc", 0, (void **)&pFilter);
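
    The call above hard-codes L".doc". A minimal sketch of deriving the extension from the actual file name is shown below. ExtensionForFilter is a hypothetical helper, not part of the original answer or of any Windows API, and it assumes the name uses '\' or '/' as path separators:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (an assumption, not from the original answer):
// returns the extension of fileName including the leading dot, e.g. L".doc",
// or an empty string when the last path component contains no dot.
std::wstring ExtensionForFilter(const std::wstring& fileName)
{
    const std::size_t slash = fileName.find_last_of(L"\\/");
    const std::size_t dot = fileName.find_last_of(L'.');

    if (dot == std::wstring::npos)
        return L"";                       // no dot anywhere in the name
    if (slash != std::wstring::npos && dot < slash)
        return L"";                       // the dot belongs to a directory name
    return fileName.substr(dot);
}
```

    The result could then be passed on as LoadIFilter(ExtensionForFilter(name).c_str(), 0, (void **)&pFilter). An empty result should be treated as "no filter can be selected".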
    

    Then we need to initialize an IStream*:

    IPersistStream *stream;
    HRESULT hr_qi = pFilter->QueryInterface(&stream);
    

    // Read the whole file into memory.
    std::ifstream ifs(filename, std::ios::binary);
    std::string content((std::istreambuf_iterator<char>(ifs)),
                        (std::istreambuf_iterator<char>()));

    // Copy the file contents into a movable global memory block.
    IStream *comStream;
    HGLOBAL hMem = ::GlobalAlloc(GMEM_MOVEABLE, content.size());
    LPVOID pDoc = ::GlobalLock(hMem);
    memcpy(pDoc, content.data(), content.size());
    ::GlobalUnlock(hMem);

    // TRUE: the stream owns hMem and frees it on its final Release().
    HRESULT hr_mem = ::CreateStreamOnHGlobal(hMem, TRUE, &comStream);
    HRESULT hr_stream_load = stream->Load(comStream);
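
    The snippet above does not check whether the file could be opened, so a missing file silently produces an empty buffer. Below is a rough sketch of a reading step that fails loudly instead. ReadAllBytes is a hypothetical name introduced here for illustration, and it uses only the standard library:

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (an assumption, not from the original answer):
// reads the whole file in binary mode and throws if it cannot be opened.
std::vector<char> ReadAllBytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open " + path);

    // Binary mode keeps embedded zero bytes and line endings untouched.
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}
```

    The returned buffer's data() and size() can be used in place of content when filling the HGLOBAL. It is probably also worth checking for an empty buffer before calling GlobalAlloc.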

    Then we work with the filter, as in MSDN or GitHub examples:

    if (SUCCEEDED(hr))
    {
        DWORD flags = 0;
        hr = pFilter->Init(IFILTER_INIT_INDEXING_ONLY |
                           IFILTER_INIT_APPLY_INDEX_ATTRIBUTES |
                           IFILTER_INIT_APPLY_CRAWL_ATTRIBUTES |
                           IFILTER_INIT_FILTER_OWNED_VALUE_OK |
                           IFILTER_INIT_APPLY_OTHER_ATTRIBUTES,
                           0, 0, &flags);
        if (FAILED(hr))
        {
            pFilter->Release();
            throw exception("IFilter::Init() failed");
        }

        Start();

        STAT_CHUNK stat;
        while (SUCCEEDED(hr = pFilter->GetChunk(&stat)))
        {
            if ((stat.flags & CHUNK_TEXT) != 0)
                ProcessTextChunk(pFilter, stat);

            if ((stat.flags & CHUNK_VALUE) != 0)
                ProcessValueChunk(pFilter, stat);
        }

        Finish();

        pFilter->Release();
    }
    else
    {
        throw exception("LoadIFilter() failed");
    }
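
    ProcessTextChunk is not shown in this answer. One possible shape for its core loop is sketched below, assuming the usual IFilter::GetText contract: on entry the count is the buffer capacity in characters, on exit it is the number of characters written, and a failure code such as FILTER_E_NO_MORE_TEXT ends the chunk. ReadChunkText is a hypothetical name, and it is written as a template only so that it compiles without the Windows headers:

```cpp
#include <string>

// Hypothetical helper (an assumption, not from the original answer).
// Filter is any type with GetText(unsigned long*, wchar_t*) that follows
// the IFilter::GetText contract described above.
template <class Filter>
std::wstring ReadChunkText(Filter& filter)
{
    std::wstring text;
    wchar_t buffer[4096];

    for (;;)
    {
        // In: capacity of buffer. Out: number of characters written.
        unsigned long count = sizeof(buffer) / sizeof(buffer[0]);
        const long result = filter.GetText(&count, buffer);
        if (result < 0)                   // same test as FAILED(result)
            break;
        text.append(buffer, count);       // do not rely on a terminator
    }
    return text;
}
```

    On Windows this would presumably be called as ReadChunkText(*pFilter) from inside ProcessTextChunk.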

    It is worth emphasizing that in this situation there is no need to implement your own version of IStream*, unless you are writing your own Windows Search.



