Nepomuk/FileIndexer

From KDE UserBase Wiki
Revision as of 18:19, 28 January 2013 by Yurchor (talk | contribs)

Note

This page documents the Nepomuk File Indexer and how it can be configured.


File Indexer

Nepomuk serves as the primary file indexer for the KDE Workspace.

Architecture

As of KDE Workspace 4.10, the Nepomuk File Indexer indexes files in two phases. The first phase, called Basic Indexing, extracts only the filename, modification date, and mimetype. The second phase, called File Indexing, looks inside the file and extracts information such as the Artist, Album and Title.

By default, Basic Indexing always runs; it can be paused by suspending the Nepomuk File Indexer via the Nepomuk Controller. File Indexing, by default, is only performed when the user is idle.
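
The two phases can be pictured with a small toy model. This is only an illustrative sketch in Python, not Nepomuk's actual implementation: the class and function names are hypothetical, the content-extraction step is a placeholder, and the rule that basic indexing takes priority over file indexing is an assumption made for the example.

```python
import mimetypes
import os
from collections import deque
from datetime import datetime, timezone


def basic_index(path):
    """Phase 1: collect only cheap metadata, without reading the file content."""
    stat = os.stat(path)
    mimetype, _encoding = mimetypes.guess_type(path)
    return {
        "filename": os.path.basename(path),
        "modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc),
        "mimetype": mimetype,  # None if the extension is not recognised
    }


class TwoPhaseIndexer:
    """Toy model: phase 1 always runs, phase 2 only while the user is idle."""

    def __init__(self):
        self.basic_queue = deque()  # files waiting for Basic Indexing
        self.file_queue = deque()   # files waiting for File Indexing
        self.index = {}

    def add(self, path):
        self.basic_queue.append(path)

    def step(self, user_idle):
        """Process at most one file; return the phase that ran, or None."""
        if self.basic_queue:
            path = self.basic_queue.popleft()
            self.index[path] = basic_index(path)
            self.file_queue.append(path)
            return "basic"
        if self.file_queue and user_idle:
            path = self.file_queue.popleft()
            # Placeholder for content extraction (artist, album, title, ...).
            self.index[path]["content_indexed"] = True
            return "file"
        return None
```

Calling step(user_idle=False) repeatedly drains only the basic queue; the second phase makes progress once step(user_idle=True) is called.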

Changing the default behavior

Depending on your requirements, you can change the default behavior, for example to allow the file indexer to always run. To do so, edit the nepomukstrigirc file and add the following options:

[Indexing]
BasicIQDelay=0
FileIQDelay=0
NormalMode_FileIndexing=suspend


These options have the following meaning:

  • BasicIQDelay - By default, the basic indexing queue does not wait between files. This parameter introduces an artificial delay between them.
  • FileIQDelay - File Indexing is an intensive process. This parameter adds a delay between files so that indexing runs more slowly.
  • NormalMode_FileIndexing - This can be either "suspend" or "resume". If it is set to "resume", the file indexer always runs in normal mode, that is, even when the user is not idle.
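
If you prefer to script the change, the options above can be written with Python's standard configparser module. This is a minimal sketch under some assumptions: the function name is hypothetical, the location of nepomukstrigirc is not given here and must be passed in, and the delay values are written as given because the page does not state their unit. Note also that configparser drops comments when it rewrites a file and may reject files that do not follow plain INI syntax, so keep a backup.

```python
import configparser


def set_indexing_options(rc_path, basic_delay=0, file_delay=0, normal_mode="suspend"):
    """Write the [Indexing] options to rc_path, keeping other existing entries."""
    if normal_mode not in ("suspend", "resume"):
        raise ValueError('normal_mode must be "suspend" or "resume"')

    config = configparser.ConfigParser(interpolation=None)
    config.optionxform = str  # keep key case; the default would lowercase keys
    config.read(rc_path)      # a missing file is treated as empty

    if not config.has_section("Indexing"):
        config.add_section("Indexing")
    config["Indexing"]["BasicIQDelay"] = str(basic_delay)
    config["Indexing"]["FileIQDelay"] = str(file_delay)
    config["Indexing"]["NormalMode_FileIndexing"] = normal_mode

    with open(rc_path, "w") as rc_file:
        # Write "key=value" without spaces, matching the snippets on this page.
        config.write(rc_file, space_around_delimiters=False)
```

For example, set_indexing_options(path_to_rc, normal_mode="resume") would let the file indexer run even when the user is not idle, where path_to_rc is the location of nepomukstrigirc on your system.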

Startup Scan

When the Nepomuk File Indexer starts, it scans through all the files marked for indexing and checks whether they have been modified. This startup scan may take some time. It is not configurable by default, but it can be avoided by adding this parameter to nepomukstrigirc:

[General]
disable initial update=true


This will disable the startup scan of all the indexed files.
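
To see why the startup scan can be slow, consider a simplified model of such a check. The sketch below is hypothetical and is not how Nepomuk stores or compares its data; it only illustrates that a modification check needs at least one file system lookup per indexed file.

```python
import os


def find_modified(indexed_mtimes):
    """Return paths whose modification time differs from the recorded one.

    indexed_mtimes maps a path to the mtime stored when it was last indexed.
    Files that no longer exist are reported as well.
    """
    changed = []
    for path, recorded in indexed_mtimes.items():
        try:
            current = os.stat(path).st_mtime  # one lookup per indexed file
        except FileNotFoundError:
            changed.append(path)
            continue
        if current != recorded:
            changed.append(path)
    return changed
```

With a large number of indexed files, this loop alone touches every one of them, which is the cost the "disable initial update" parameter avoids.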

File Indexing Errors

Because of bugs or malformed files, you may occasionally encounter files that cannot be indexed. In that case, you can log the indexing errors by setting the following parameter in nepomukstrigirc:

[General]
debug mode=true


This causes all file indexing errors to be written to the $KDEDIR/share/data/nepomuk/file-indexer-error.log file. You might want to check this file and report the errors at http://bugs.kde.org, attaching the relevant file and error message.

File Formats

With the KDE Workspace 4.10 release, we no longer rely on Strigi for file indexing. We now rely on our own home-grown indexer, which uses libraries already heavily used within KDE.

In 4.10, we support most Image, Video, and Audio formats. Document format support is still lacking, however: only PDF is supported. If you encounter a file that you think has not been indexed, you can index it manually by running the command nepomukindexer <fileUrl>. Make sure Nepomuk debug messages are enabled. If the file is indexed successfully but Nepomuk does not extract the expected information, please file a bug report with the relevant details.
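
When several files need to be checked, the manual command can be wrapped in a small script. The following is a sketch only: the helper names are hypothetical, and it assumes that <fileUrl> is a file:// URL, which the page does not spell out. The meaning of the exit code is not documented here either, so the sketch simply returns it.

```python
import subprocess
from pathlib import Path


def indexer_command(path):
    """Build the argument list for: nepomukindexer <fileUrl>."""
    # Assumption: <fileUrl> is a file:// URL; as_uri() percent-encodes spaces.
    file_url = Path(path).absolute().as_uri()
    return ["nepomukindexer", file_url]


def index_file(path):
    """Run the indexer on one file and return its exit code.

    Raises FileNotFoundError if nepomukindexer is not installed or not in PATH.
    """
    completed = subprocess.run(indexer_command(path), check=False)
    return completed.returncode
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces or other special characters.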