Difference between revisions of "User:Claus chr/DPL"


Revision as of 18:50, 17 June 2011

Reference: DPL Manual
See also Pipesmoker's notes and this page of examples
Example UI on this Template:Catlist page


Searching for pages containing a certain text string

Matching content in pages: You need to include the contents of the candidate pages in this page (include = * does that) and then apply a Perl-like regexp to their contents to keep only the interesting pages (includematch = ...). If you are searching in translated pages (for example, all Danish pages), it is often advantageous to set namespace = Translations; otherwise you will get both the full pages and all the translation units containing matching text, which could make for very long output.

<DPL>
  titlematch = %/da
  namespace = Translations
  include = *
  includematch = /albummet/
  resultsheader = Danish translation units containing the string "albummet"
  format = ,\n* [[%PAGE%|%TITLE%]]\n,,
</DPL>


All English pages linking to a given page

Note
The template is under development. If this page behaves strangely, it is probably one of my experiments gone haywire. Normally, these fits should only last for a couple of minutes, but they may recur. So be warned!


<DPL>
  namespace=Main |User
  nottitleregexp = .*(/..(-..)?|_[(].*[)])$
  include = *
  includematch = #\[\[[Ss]pecial\:[mM]y[lL]anguage/Getting[_ ]Help|\[\[Getting[_ ]Help#
  includemaxlength = 0
  resultsheader = The pages in the translation system linking to Getting Help are:\n
  format = ,\n* [[%PAGE%|%TITLE%]],,
</DPL>

<DPL>

 namespace=Main|User
 nottitleregexp = .*(/..(-..)?|_[(].*[)])$
 include = *
 includematch = #\[\[([Ss]pecial\:[mM]y[lL]anguage/)?Getting[_ ]Help#
 includemaxlength = 0
 resultsheader = The pages in the translation system linking to Getting Help are:\n
 format = ,\n* %TITLE%,,

</DPL>

Somehow this is broken. Without the namespace clause we get

<DPL>

 nottitleregexp = .*(/..(-..)?|_[(].*[)])$
 include = *
 includematch = #\[\[([Ss]pecial\:[mM]y[lL]anguage/)?Getting[_ ]Help#
 includemaxlength = 0
 resultsheader = The pages in the translation system linking to Getting Help are:\n
 format = ,\n* %TITLE%,,

</DPL>

which excludes some pages in the main namespace(!) and includes one in the User namespace!? Explicitly specifying namespace Main gives this.

<DPL>

 namespace = Main
 nottitleregexp = .*(/..(-..)?|_[(].*[)])$
 include = *
 includematch = #\[\[([Ss]pecial\:[mM]y[lL]anguage/)?Getting[_ ]Help#
 includemaxlength = 0
 resultsheader = The pages in the translation system linking to Getting Help are:\n
 format = ,\n* %TITLE%,,

</DPL>

That seems reasonable, but wasn't Main supposed to be used by default? Finally, specifying namespace User gives us all three user pages:

<DPL>

 namespace=User
 nottitleregexp = .*(/..(-..)?|_[(].*[)])$
 include = *
 includematch = #\[\[([Ss]pecial\:[mM]y[lL]anguage/)?Getting[_ ]Help#
 includemaxlength = 0
 resultsheader = The pages in the translation system linking to Getting Help are:\n
 format = ,\n* %TITLE%,,

</DPL>
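The nottitleregexp line shared by the queries above appears to be there to skip translated pages, so that only English pages are listed. As a rough check of what it excludes, here is a minimal sketch that runs the same pattern through Python's re module, which stands in for whatever regex engine DPL uses. The page titles are hypothetical and are written in stored form (underscores instead of spaces), which is an assumption about how DPL sees them.

```python
import re

# Pattern copied from the nottitleregexp lines above.
NOT_TITLE = re.compile(r".*(/..(-..)?|_[(].*[)])$")

# Hypothetical titles, chosen only to exercise each branch of the pattern.
titles = [
    "Amarok",         # plain English page
    "Amarok/da",      # translation subpage, two-letter code
    "Amarok/pt-br",   # translation subpage, code with region
    "Amarok_(es)",    # old-style translation
    "Amarok/Manual",  # ordinary subpage
    "Tutorials/Qt",   # ordinary subpage with a two-character name
]


def is_excluded(title: str) -> bool:
    """True if the title would be dropped by the nottitleregexp pattern."""
    return NOT_TITLE.search(title) is not None


kept = [t for t in titles if not is_excluded(t)]
print(kept)
```

Note that the pattern cannot tell a language code from any other two-character subpage name, so a title like the hypothetical Tutorials/Qt would be excluded as well.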

All English pages linking to a given page (template version)

The lesson here seems to be that, at least when include is involved, we can't rely on DPL handling more than one namespace at a time. This calls for a template: {{LinksTo|Getting[_ ]Help}} gives

{{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^[A-J] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?Getting[_ ]Help( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with A-J\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^K[A-Z] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?Getting[_ ]Help( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with KA-KZ\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^K[a-ce-z] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?Getting[_ ]Help( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with Ka-Kc or Ke-Kz\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^Kd | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?Getting[_ ]Help( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with Kd\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^[L-Z] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?Getting[_ ]Help( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with L-Z\n }}


So, to recap, the problem is this: We would like to find all pages that link to a given page (the target). The What links here wiki page does not work well with links adapted to the translation extension, i.e. links of the form [[Special:myLanguage/target page]], which means that almost none of our links would be found.

Instead we have to use DPL to find those pages, but even here we have to be careful. The obvious search using the linksto clause doesn't work either, probably for the same reason that What Links Here failed. The solution is to search the content of every page for a link to the target page. The problem here is that DPL has to write the entire content of each page into the page where the search occurs, then search the text, then filter the text out again (that's the includemaxlength=0 part), leaving only links to the pages that contain a link to the target page.
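The idea of this content search can be sketched outside the wiki. The following is only an illustration of the approach, not of how DPL works internally: the function name pages_linking_to and the sample pages are made up, and Python's re module stands in for DPL's regex engine.

```python
import re


def pages_linking_to(pages: dict[str, str], target_regex: str) -> list[str]:
    """Return the titles of pages whose text contains a link to the target."""
    link = re.compile(
        r"\[\[([Ss]pecial:[mM]y[lL]anguage/)?" + target_regex + r"(\]|\||#)"
    )
    hits = []
    for title, text in pages.items():
        if link.search(text):
            # Keep only the title; the page text itself is thrown away,
            # which is the role includemaxlength = 0 plays in the query.
            hits.append(title)
    return sorted(hits)


# Hypothetical page contents.
pages = {
    "Amarok": "See [[Special:myLanguage/Getting Help|the help page]].",
    "Kopete": "Nothing relevant here.",
    "Dolphin": "Old-style link: [[Getting_Help]].",
}

print(pages_linking_to(pages, "Getting[_ ]Help"))
```

Every page text has to be read in full before it can be dropped, which is why this kind of query is expensive.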

For some reason, queries of this kind seem to confuse DPL. Specifying no namespace should result in the main namespace being searched, but as we saw above, not every matching page in Main was found, and oddly one User: page was found. Specifying both namespaces in one search seems to work better, but still misses a User: page. The solution seems to be one search for each namespace. That seems to find everything, with one exception: the start page Welcome to KDE UserBase seems to be outside of namespaces and is never found. Let's hope it is one of a kind.

This method of searching has been implemented as a template {{LinksTo|target page}} (which can easily be modified to include more namespaces in the search). Just to be clear: We are searching for the actual occurrence of a link in the texts of pages. The search finds both Special:myLanguage links and old-style links. It tries to take every known variation into account (Special: or special:, and mylanguage, Mylanguage, myLanguage, or MyLanguage). One variation we have to deal with by hand is that spaces can be written either as a normal space character or as an underscore. Therefore we should either make two searches if the target name contains a space, or write [_ ] wherever a space occurs in the name, as in {{LinksTo|Getting[_ ]Help}}. Also note that case is significant: {{LinksTo|getting[_ ]help}} yields

{{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^[A-J] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?getting[_ ]help( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with A-J\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^K[A-Z] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?getting[_ ]help( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with KA-KZ\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^K[a-ce-z] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?getting[_ ]help( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with Ka-Kc or Ke-Kz\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^Kd | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?getting[_ ]help( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with Kd\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^[L-Z] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?getting[_ ]help( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with L-Z\n }}


We just get two error messages (one for each search), since DPL gives an error message whenever a text search finds no match on any page.
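The variations described above can be checked with a small sketch. It builds the includematch pattern from the template code on this page (with {{!}} written out as |) and tries it on a few hypothetical link strings. Python's re module is used in place of DPL's regex engine, and link_pattern is a made-up helper name.

```python
import re


def link_pattern(target_regex: str) -> re.Pattern:
    """Build the includematch pattern for a target given as a regex fragment."""
    return re.compile(
        r"\[\[([Ss]pecial\:[mM]y[lL]anguage/)?" + target_regex + r"(\]|\||#)"
    )


pattern = link_pattern("Getting[_ ]Help")

# Hypothetical wikitext snippets, one per variation.
samples = [
    "[[Special:myLanguage/Getting Help|help]]",  # translation-aware link
    "[[special:MyLanguage/Getting_Help]]",       # other capitalisation, underscore
    "[[Getting Help#Forums]]",                   # old-style link to a section
    "[[getting help]]",                          # wrong case in the target
    "[[Getting Help/Subpage]]",                  # link to a subpage of the target
]

for s in samples:
    print(bool(pattern.search(s)), s)
```

The first three samples match. The last two do not: the target is compared case-sensitively, and the closing group requires ], | or # directly after the target name.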

The template

The code of the first half of the template is here:

{{#dpl:
| namespace = Main
| nottitleregexp = .*(/..(-..)?{{!}}_[(].*[)])$
| include = *
| includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?{{{1}}}(\]{{!}}\{{!}}{{!}}#)@
| includemaxlength = 0
| format = ,\n* [[%PAGE%|%TITLE%]],,
}}

The second half is the same, except that | namespace = Main is replaced by | namespace = User.

I couldn't make normal DPL tags work in the template, but fortunately the {{#dpl parser function does work. To add more namespaces, just add a copy of the first half of the template to the end, and in the copy replace | namespace = Main by | namespace = Whatever.
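Since each namespace needs its own copy of the query, the template body could also be generated rather than pasted by hand. This is just a sketch of that idea: the block text is the template code shown above with the namespace swapped for a placeholder, and NAMESPACE and links_to_template are names invented for the example.

```python
# One query block, copied from the template code above.
# NAMESPACE is a placeholder introduced for this sketch.
TEMPLATE_HALF = r"""{{#dpl:
| namespace = NAMESPACE
| nottitleregexp = .*(/..(-..)?{{!}}_[(].*[)])$
| include = *
| includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?{{{1}}}(\]{{!}}\{{!}}{{!}}#)@
| includemaxlength = 0
| format = ,\n* [[%PAGE%|%TITLE%]],,
}}"""


def links_to_template(namespaces: list[str]) -> str:
    """Return template wikitext with one #dpl query per namespace."""
    return "\n".join(TEMPLATE_HALF.replace("NAMESPACE", ns) for ns in namespaces)


body = links_to_template(["Main", "User"])
print(body)
```

The output is plain wikitext to paste into the template page. Adding a namespace then means adding one name to the list.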

Testing the template

{{LinksTo|User:Claus[_ ]chr}}{{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^[A-J] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?User:Claus[_ ]chr( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with A-J\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^K[A-Z] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?User:Claus[_ ]chr( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with KA-KZ\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^K[a-ce-z] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?User:Claus[_ ]chr( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with Ka-Kc or Ke-Kz\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^Kd | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?User:Claus[_ ]chr( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with Kd\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^[L-Z] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?User:Claus[_ ]chr( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with L-Z\n }}


Problem nr. 1. We also find links to subpages of the target (doh!), but that should be simple to fix. (The error message just means that there were no linking pages in the Main namespace.)

{{LinksTo|Amarok}}{{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^[A-J] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?Amarok( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with A-J\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^K[A-Z] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?Amarok( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with KA-KZ\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^K[a-ce-z] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?Amarok( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with Ka-Kc or Ke-Kz\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^Kd | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?Amarok( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with Kd\n }} {{#dpl: | namespace = | nottitleregexp = .*((/[a-z][a-z](.|-..)?)|([ _][(][a-z][a-z](...)?[)]))$ | titleregexp = ^[L-Z] | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?Amarok( )*(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, | resultsheader = There are %TOTALPAGES% pages beginning with L-Z\n }}


Yes! There must be thousands of pages linking to some subpage of Amarok. They are obviously not found. Now can I find this link to Talk:Translation Workflow?

{{TestLinksTo|Talk:Translation[_ ]Workflow}}{{#dpl: | namespace = User | nottitleregexp = .*(/..(-..)?|_[(].*[)])$ | redirects = include | include = * | includematch = @\[\[([Ss]pecial\:[mM]y[lL]anguage/)?Talk:Translation[_ ]Workflow(\]|\||#)@ | includemaxlength = 0 | format = ,\n* %TITLE%,, }}


This page is not found, but it is found if the same query is performed on another page! I guess I should have expected that, given the way these queries are performed. Otherwise the page would have to include itself, which could lead to problems.

Kopete Subpages in 3 columns

<DPL>
  titlematch = Kopete/%
  notnamespace = Translations
  columns = 3
  format = ,\n* [[%PAGE%|%TITLE%]],,
</DPL>

Akonadi Subpages in Danish

<DPL>
  titlematch = Akonadi%/da
  notnamespace = Translations
  format = ,\n* [[%PAGE%|%TITLE%]],,
</DPL>

Archived pages

<DPL>
  titlematch = %
  namespace = Archive
  columns = 2
  format = ,\n* [[%PAGE%|%TITLE%]],,
  resultsheader = There are %TOTALPAGES% pages in the Archive namespace. These are:\n
</DPL>

NoIndexed pages

<DPL>
  titlematch = %
  category = Noindexed_pages
  columns = 2
  format = ,\n* [[%PAGE%|%TITLE%]],,
  resultsheader = There are %TOTALPAGES% pages in the Noindexed pages category. These are:\n
</DPL>

Ignoring Deleted Pages

"As for DPL. If you hit a page with ?action=purge attached to the URL (i.e. http://en.wikinews.org/wiki/Template:Latest_news?action=purge ), it will dump all the removed pages."

Remaining old-style translations

<DPL>
  titlematch = %_(%)
  notcategory = Template
  notnamespace = Thread
  notnamespace = Summary
  columns = 2
  format = ,\n* [[%PAGE%|%TITLE%]],,
  resultsheader = There are %TOTALPAGES% pages (partly) remaining in old-style translations. These are:\n
</DPL>

Pages with old i18n bar

<DPL>
  titlematch = %
  namespace = Main
  uses = Template:I18n/Language Navigation Bar
  columns = 3
  format = ,\n* [[%PAGE%|%TITLE%]],,
  resultsheader = There are %TOTALPAGES% pages that still display the old i18n language bar\n
</DPL>

Pages with old i18n bar but w/o old-way-translated ones

<DPL>
  nottitlematch = %_(%)
  namespace = Main
  uses = Template:I18n/Language Navigation Bar
  columns = 3
  format = ,\n* [[%PAGE%|%TITLE%]],,
  resultsheader = There are %TOTALPAGES% relevant pages that still display the old i18n language bar\n
</DPL>

Pages not updated since 1st July 2010

<DPL>
  namespace = Main
  lastrevisionbefore = 201007010000
  columns = 2
  ordermethod=lastedit
  format = ,\n* (%DATE%) [[%PAGE%|%TITLE%]],,
  resultsheader = There are %TOTALPAGES% pages without recent updates\n
</DPL>

Listing Non-Translation Pages

<DPL>
  nottitlematch = %/__|%/zh-%|%(%)
  titlematch = Amarok%
  namespace = Main
  columns = 1
  format = ,\n* [[%PAGE%|%TITLE%]],,
  resultsheader = There are %TOTALPAGES% Amarok pages, not counting translations\n
</DPL>
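The titlematch and nottitlematch values use SQL LIKE wildcards rather than regular expressions. Assuming standard LIKE semantics (% for any run of characters, _ for exactly one character) and | as the separator between alternatives, the exclusion list above can be tried out with a small sketch. The helper names and the sample titles are made up for the illustration.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a LIKE pattern (% and _ wildcards) into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any run of characters, possibly empty
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def matches_any(title: str, spec: str) -> bool:
    """True if the whole title matches one of the |-separated LIKE patterns."""
    return any(like_to_regex(p).fullmatch(title) for p in spec.split("|"))


SPEC = "%/__|%/zh-%|%(%)"  # the nottitlematch value from the query above

# Hypothetical titles in stored form (underscores instead of spaces).
for title in ["Amarok", "Amarok/da", "Amarok/zh-cn", "Amarok_(es)",
              "Amarok/Manual", "Amarok/Manual/da"]:
    print(title, "excluded" if matches_any(title, SPEC) else "kept")
```

Under these assumptions only Amarok and Amarok/Manual are kept. The other four titles each match one of the three alternatives.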

List all pages in a specific namespace

<DPL>
  nottitlematch = %/__|%/zh-%|%pt-%|%(%)
  namespace = MediaWiki
  columns = 3
  format = ,\n* [[%PAGE%|%TITLE%]],,
  resultsheader = These %TOTALPAGES% pages are in the MediaWiki namespace\n
</DPL>

To count translated pages in a specific language:

<DPL>
  titlematch = %/en
  notnamespace = Translations
  columns = 3
  format = ,\n* [[%PAGE%|%TITLE%]],,
  resultsheader = There are %TOTALPAGES% pages (partly) translated to English. These are:\n
</DPL>

<DPL>

 titlematch = %/en
 notnamespace = Translations
 columns = 3
 format = ,\n* %TITLE%,,
 resultsheader = There are %TOTALPAGES% pages (partly) translated to English. These are:\n

</DPL>


Content is available under Creative Commons License SA 4.0 unless otherwise noted.
-->