Nepomuk/FileIndexer: Difference between revisions

From KDE UserBase Wiki
(formatting change)
(Marked this version for translation)

Revision as of 16:07, 30 January 2013


Note

This page documents the Nepomuk File Indexer and how it can be configured.


File Indexer

Nepomuk serves as the primary file indexer for the KDE Workspace.

Architecture

As of KDE Workspace 4.10, the Nepomuk File Indexer indexes files in two phases. The first phase, called Basic Indexing, extracts only the filename, modification date, and MIME type. The second phase, File Indexing, looks inside the file and extracts information such as the artist, album, and title.

By default, Basic Indexing always runs; it can be paused by suspending the Nepomuk File Indexer via the Nepomuk Controller. File Indexing, by default, is performed only when the user is idle.

Changing the default behavior

Depending on your requirements, you can change the default behavior and allow the File Indexer to always run. To do so, edit the nepomukstrigirc file and add the following options:

[Indexing]
BasicIQDelay=0
FileIQDelay=0
NormalMode_FileIndexing=suspend


The following options can be changed:

  • BasicIQDelay - By default, the Basic Indexing queue does not wait between files. This parameter introduces an artificial delay.
  • FileIQDelay - File Indexing is an intensive process. This parameter adds a delay between files so that indexing runs more slowly and uses fewer resources at a time.
  • NormalMode_FileIndexing - This can be either "suspend" or "resume". If it is set to "resume", the file indexer always runs during normal indexing rather than only when the user is idle.
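
As a sketch of how these options combine, the fragment below keeps File Indexing running at all times while slowing it down. The delay value is illustrative only, and this page does not state the unit in which delays are measured, so treat the number as an assumption to adjust.

```ini
[Indexing]
# No pause between files during Basic Indexing (the default described above).
BasicIQDelay=0
# Illustrative value only (unit not stated on this page): pause between files during File Indexing.
FileIQDelay=5
# "resume" keeps File Indexing running instead of waiting for the user to be idle.
NormalMode_FileIndexing=resume
```

Setting NormalMode_FileIndexing back to "suspend" should restore the idle-only behavior described earlier.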

Startup Scan

When the Nepomuk File Indexer starts, it scans through all the files marked for indexing and checks whether they have been modified. This startup scan may take some time. It is not configurable by default. It can, however, be avoided by adding the following parameter to the nepomukstrigirc file:

[General]
disable initial update=true


This will disable the startup scan of all the indexed files.

File Indexing Errors

Because of bugs or malformed files, you may occasionally encounter files that cannot be indexed. In that case, you can log the indexing errors by setting the following parameter in the nepomukstrigirc file:

[General]
debug mode=true


This will cause all file errors to be written to the $KDEDIR/share/data/nepomuk/file-indexer-error.log file. You may want to check this file and report the errors, attaching the relevant file and error message, at http://bugs.kde.org.
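
If both the startup-scan option and the error-logging option are wanted, it is tidier to keep them under a single [General] group. A minimal sketch, assuming both settings are desired together:

```ini
[General]
# Skip the scan of indexed files when the File Indexer starts.
disable initial update=true
# Write file indexing errors to the error log.
debug mode=true
```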

File Formats

With the KDE Workspace 4.10 release, we no longer rely on Strigi for file indexing. We now rely on our own home-grown indexer, which uses libraries already heavily used within KDE.

In 4.10, we support most image, video, and audio formats. Support for document formats is still lacking, however, and only PDF is supported. If you encounter a file that you think has not been indexed, you can index it manually by running the command nepomukindexer fileUrl. Make sure Nepomuk debug messages are enabled. If the file has been indexed successfully but Nepomuk has not managed to extract the required information, please file a bug report with the relevant details.