User:Claus chr/Searching: Difference between revisions

    From KDE UserBase Wiki

    Revision as of 10:44, 7 August 2012

    Searching

    The problems

    There are (at least) three distinct problems with searching UserBase; they are (in order of increasing obscurity to regular readers and utility to writers and administrators):

    • Using the search box gives unhelpful and possibly incomplete results
    • Using What links here doesn't find Special:myLanguage links
    • Using DPL yields incomplete results

    The search box

    The problem: The query searches both page names and page content (by default only in the main namespace). This means you will often get hits for pages in many different languages, and because the results are presented in no particular order, they are frequently, if not useless, at least unreasonably difficult to use.

    I'm not even sure we find all relevant hits. The last time I searched for "Amarok/Man" I got two pages' worth of results (about 40 pages in all), most of them translated pages. The search ought to have found all the Amarok Manual pages (and all their translations). In fact, only two English manual pages were found: Amarok/Manual/Various/FAQ and Amarok/Manual/Various/FAQ/en! (As an interesting side note, the search box query did find some pages containing a Special:myLanguage link!) (See the following Note box.)

    Note

    This needs to be rechecked. My feeling is that we usually get far more hits, so maybe this is a transient phenomenon. It's worrying nonetheless.

    Update: The problem described above was caused by my searching on an incomplete page name. It seems that in this case only the page content is searched. Searching for Amarok/Manual/ gives a lot of hits in many languages, including English!


    Going to the advanced search page is no help at all; that just gives us the option to cast an even wider net by searching more namespaces. (By the way, translated pages all live in the main namespace. The Translation namespace holds the individual translated units.)
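When the goal is to find every page under a known title prefix, such as "Amarok/Man", a title-prefix listing may be more dependable than the search box. The sketch below only builds the request URL for the standard MediaWiki `list=allpages` query with `apprefix`. The endpoint `https://userbase.kde.org/api.php` and the helper name `allpages_prefix_url` are assumptions made for illustration, not something taken from this page.

```python
from urllib.parse import urlencode

# Assumed endpoint; the actual API path on UserBase may differ.
API_URL = "https://userbase.kde.org/api.php"


def allpages_prefix_url(prefix, namespace=0, limit=500):
    """Build a MediaWiki API URL listing pages whose title starts with `prefix`.

    namespace=0 is the main namespace, where translated pages live.
    """
    params = {
        "action": "query",
        "list": "allpages",
        "apprefix": prefix,
        "apnamespace": namespace,
        "aplimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


url = allpages_prefix_url("Amarok/Man")
print(url)
```

A long listing comes back in batches, so a real client would also have to follow the `apcontinue` value returned by the API until no more pages remain.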

    What we should wish for in a general search:

    • Only results in the reader's language by default
    • Some sort of Google-like prioritizing of results (?)
    • The option to specify desired language(s) as well as namespaces on the advanced page.
    • Search results should be comprehensive (e.g. all Amarok manual pages should be found)
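The first and third wishes amount to filtering hits by language. A minimal sketch of such a filter follows, assuming that a translated page is named after its source page plus a language code, as in Amarok/Manual/Various/FAQ/en. The suffix pattern, the function names and the `/da` and `/pt-br` titles are illustrative guesses, not the actual behaviour of the wiki software.

```python
import re

# Assumption: translations are titled "<source title>/<language code>".
# The pattern is a rough guess at what a language code looks like.
LANG_SUFFIX = re.compile(r"/([a-z]{2,3}(?:-[a-z]{2,4})?)$")


def page_language(title, source_language="en"):
    """Return the language code at the end of a title, or the source language."""
    match = LANG_SUFFIX.search(title)
    return match.group(1) if match else source_language


def filter_by_language(titles, language, source_language="en"):
    """Keep only the titles whose detected language equals `language`."""
    return [t for t in titles if page_language(t, source_language) == language]


# The /da and /pt-br titles are hypothetical examples.
titles = [
    "Amarok/Manual/Various/FAQ",
    "Amarok/Manual/Various/FAQ/en",
    "Amarok/Manual/Various/FAQ/da",
    "Amarok/Manual/Various/FAQ/pt-br",
]
print(filter_by_language(titles, "da"))
```

A page whose last title segment happens to be a short lowercase word would be misread as a translation, so a real implementation would need the wiki's own list of language codes.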

    What Links Here

    In its current form What Links Here is simply useless: it doesn't know about Translate links (Special:myLanguage), so it misses most pages. Only lingering old-style links are found (mostly on translated pages). We need a Translate-aware replacement for What Links Here.
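To illustrate what a Translate-aware replacement would have to do, here is a rough sketch that scans raw wikitext for links of the form `[[Special:myLanguage/Target|label]]` and reports which pages point at a given target. It assumes the page sources are already available as a title-to-wikitext mapping; that mapping, the function names and the sample pages are hypothetical.

```python
import re

# Matches [[Special:myLanguage/Target]], with an optional #section and |label.
MYLANGUAGE_LINK = re.compile(
    r"\[\[\s*Special:myLanguage/([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)


def mylanguage_targets(wikitext):
    """Return the set of page titles linked through Special:myLanguage."""
    return {
        m.group(1).strip().replace("_", " ")
        for m in MYLANGUAGE_LINK.finditer(wikitext)
    }


def links_here(pages, target):
    """List titles in `pages` (title -> wikitext) that link to `target`."""
    return sorted(
        title for title, text in pages.items()
        if target in mylanguage_targets(text)
    )


# Hypothetical page sources, for illustration only.
pages = {
    "Example A": "See [[Special:myLanguage/Amarok/Manual|the manual]].",
    "Example B": "Old style link: [[Amarok/Manual]].",
    "Example C": "[[Special:MyLanguage/Amarok/Manual#Intro]] and more.",
}
print(links_here(pages, "Amarok/Manual"))
```

The sketch only handles Special:myLanguage links. A full replacement would have to merge its results with the ordinary links that What Links Here already finds.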

    DPL

    In theory DPL should be able to overcome the limitations of both of the other two options. Sadly that doesn't hold in the real world. For some queries it doesn't find all matching pages. Experiments shown on User:Claus chr/DPL/Test seem to indicate that some fixed capacity is exceeded. Sometimes a broad search on a pattern gives fewer results than a search for the same pattern in a narrower range of pages (e.g. searching only the User namespace as opposed to searching both the User and main namespaces in one query). The results are very reproducible; always the same hits (and misses).
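The inconsistency described above can be checked mechanically: every hit from the narrow query should also appear in the broad one, so any title in the narrow result but not in the broad result is a miss. The following is a minimal sketch of that check. The function name and the two result lists are made up for illustration and are not real DPL output.

```python
def missing_from_broad(narrow_hits, broad_hits):
    """Return titles found by the narrow query but absent from the broad one.

    The broad query covers a superset of the pages, so a non-empty
    result means the broad query dropped matching pages.
    """
    return sorted(set(narrow_hits) - set(broad_hits))


# Hypothetical result lists for the same pattern, not real UserBase data.
narrow = ["User:Example/Page A", "User:Example/Page B"]
broad = ["User:Example/Page A", "Example page"]
print(missing_from_broad(narrow, broad))
```

Since the hits and misses are reported to be reproducible, running such a comparison over several pairs of queries could help narrow down where the apparent capacity limit sits.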

    Note

    The text on User:Claus chr/DPL/Test reflects my thoughts on the problem as they evolved in response to the experiments, so it should be most accurate (and confusing) towards the end of the page. Also, the description does not match the results currently displayed; this is just because more pages have been added to UserBase since the test page was written. (Yes, more pages can lead to fewer hits; see above.)