Possibly some progress on Roster item parsing

Sync Blizzard's Armory data with WoWRoster (addon deprecated, no longer works; see ApiSync)

Moderators: Ulminia, poetter

Postby zanix » Tue Dec 18, 2007 1:01 pm

I have been looking at the tooltips lately and I noticed something
Viewing them in the DB, they look just fine, but if I pass the tooltip through htmlentities(), utf8_encode(), and utf8_decode()... I get some strange results

These are split, line by line, with each line wrapped in pipes |

CP tooltip example
Same results with htmlentities(), utf8_encode(), and utf8_decode()
Shadowcast Tunic =>
|Soulbound|
|Chest Cloth|
|122 Armor|
|+21 Stamina|
|+15 Intellect|
|Durability 80 / 80|
|Equip: Improves spell critical strike rating by 14.|
|Equip: Increases damage and healing done by magical spells and effects by up to 44.|


ArmorySync tooltip example
htmlentities()
Feralfen Beastmaster's Hauberk =>
|Soulbound|
|Chest Mail|
|528Â Armor|
|+21Â Agility|
|+31Â Stamina|
|+20Â Intellect|
||
|Durability:Â 76 / 100|
|Equip:Â Increases attack power by 42.|
|Source:Â Quest Reward|


ArmorySync tooltip example
utf8_encode()
Feralfen Beastmaster's Hauberk =>
|Soulbound|
|Chest Mail|
|528Â Armor|
|+21Â Agility|
|+31Â Stamina|
|+20Â Intellect|
||
|Durability:Â 76 / 100|
|Equip:Â Increases attack power by 42.|
|Source:Â Quest Reward|


ArmorySync tooltip example
utf8_decode()
Feralfen Beastmaster's Hauberk =>
|Soulbound|
|Chest Mail|
|528�Armor|
|+21�Agility|
|+31�Stamina|
|+20�Intellect|
||
|Durability:�76 / 100|
|Equip:�Increases attack power by 42.|
|Source:�Quest Reward|


Notice the Â (and � on utf8_decode)?

Now, this doesn't show up unless the tooltip passes through htmlentities(), utf8_encode(), or utf8_decode()

We are getting an unknown, hidden character that is throwing off the parser
I think if we parse out this character somehow, the parser will work much better
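
One plausible reading of the output above is that the hidden character is a UTF-8 no-break space (bytes C2 A0), which would show up as "Â" plus an invisible space once the string is treated as Latin-1. That is an assumption, not something the tooltips themselves confirm. A minimal sketch for making such bytes visible and replacing them; `strip_nbsp()` is a hypothetical helper, not part of WoWRoster or ArmorySync:

```php
<?php
// Hypothetical helper: replace UTF-8 no-break spaces (bytes C2 A0) with plain spaces.
function strip_nbsp($line)
{
    return str_replace("\xC2\xA0", ' ', $line);
}

// Sample line built by hand, assuming "528" and "Armor" are separated by U+00A0.
$line = "528\xC2\xA0Armor";

// A hex dump shows bytes that are invisible when the string is printed.
echo bin2hex($line), "\n";      // 353238c2a041726d6f72
echo strip_nbsp($line), "\n";   // 528 Armor
```

If the hex dump of a real tooltip line shows something other than `c2a0` between the number and the word, the replacement string would need to change accordingly.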
Read the Forum Rules, the WiKi, and Search before posting!
WoWRoster v2.1 - SigGen v0.3.3.523 - WoWRosterDF
User avatar
zanix
Admin
Admin
WoWRoster.net Dev Team
WoWRoster.net Dev Team
UA/UU Developer
UA/UU Developer
 
Posts: 5546
Joined: Mon Jul 03, 2006 8:29 am
Location: Idaho Falls, Idaho
Realm: Doomhammer (PvE) - US

Postby zanix » Tue Dec 18, 2007 2:11 pm

Ok, I think I just might have solved it

Open inc/armorysync.class.php
Find Line 1237ish
Code: Select all
            $content = str_replace("\n", '', $content );
            $content = str_replace('<span class="tooltipRight">', "\t", $content );
            $content = str_replace('<br/>', "%__BRTAG%", $content );
            $content = str_replace('<br>', "%__BRTAG%", $content );


Replace with
Code: Select all
            $content = str_replace("\n", '', $content );
            $content = str_replace("\t", '', $content );
            $content = str_replace('<span class="tooltipRight">', "\t", $content );
            $content = str_replace('<br />', "%__BRTAG%", $content );
            $content = str_replace('<br>', "%__BRTAG%", $content );
            $content = str_replace('& nbsp;', ' ', $content );

(Make sure you take the space out of & nbsp;)

Now, reload all your characters and watch most of the parsing errors go away


As for the rest of them, you need to replace some locale strings as well.
Any localizers out there? These might need to be adjusted for each locale.

enUS
Code: Select all
$lang['tooltip_preg_durability']='/Durability(|:) (\d+) \/ (\d+)/';
$lang['tooltip_garbage']='<Shift Right Click to Socket>|<Right Click to Read>|Duration|Cooldown remaining|<Right Click to Open>|Source:|Boss:|Drop Rate:';


deDE
Code: Select all
$lang['tooltip_preg_durability']='/Haltbarkeit(|:) (\d+) \/ (\d+)/';
$lang['tooltip_garbage']='<Zum Sockeln Shift-Rechtsklick>|<Zum Lesen rechtsklicken>|Duration|Verbleibende Abklingzeit|<Right Click to Open>|Source:|Boss:|Drop Rate:';


esES
Code: Select all
$lang['tooltip_preg_durability']='/Durabilidad(|:) (\d+) \/ (\d+)/';
$lang['tooltip_garbage']='<Mayús clic derecho para insertar>|<Clic derecho para leer>|Duración|<Clic derecho para abrir>|<Right Click to Open>|Source:|Boss:|Drop Rate:';


frFR
Code: Select all
$lang['tooltip_preg_durability']='/Durabilité(|:) (\d+) \/ (\d+)/';
$lang['tooltip_garbage']='Maj clic-droit pour sertir|<Right Click to Read>|Duration|Temps de recharge|<Right Click to Open>|Source:|Boss:|Drop Rate:';



Then in lib/item.php
Find
Code: Select all
            $tt['Attributes']['Durability']['Line']= $matches[0];
            $tt['Attributes']['Durability']['Current'] = $matches[1];
            $tt['Attributes']['Durability']['Max'] = $matches[2];


Replace with
Code: Select all
            $tt['Attributes']['Durability']['Line']= $matches[0];
            $tt['Attributes']['Durability']['Current'] = $matches[2];
            $tt['Attributes']['Durability']['Max'] = $matches[3];
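
Why the indexes move from 1/2 to 2/3: the new `(|:)` in the durability pattern is itself a capture group, so it takes `$matches[1]`. A minimal sketch of that, using the enUS pattern from this post; `parse_durability()` is a hypothetical wrapper for illustration only, the real code in lib/item.php does this inline:

```php
<?php
// Hypothetical helper for illustration; not the actual lib/item.php code.
function parse_durability($line)
{
    // Pattern copied from the enUS locale string in this post.
    $pattern = '/Durability(|:) (\d+) \/ (\d+)/';
    if (!preg_match($pattern, $line, $matches)) {
        return null;
    }
    // $matches[1] is the optional colon ('' or ':'), so the numbers
    // sit in $matches[2] and $matches[3].
    return array('Current' => (int) $matches[2], 'Max' => (int) $matches[3]);
}

var_dump(parse_durability('Durability 80 / 80'));    // CP style, no colon
var_dump(parse_durability('Durability: 76 / 100'));  // ArmorySync style, with colon
```

Writing the optional colon as `:?` instead of `(|:)` would not add a capture group, so the numbers would stay in `$matches[1]` and `$matches[2]` and lib/item.php would not need the change.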

Postby Diska » Tue Dec 18, 2007 7:17 pm

Zanix, you are the bomb. Just updated to the latest SVN and applied your changes to armorysync.class.php and now the items from Armory get fully parsed!
So far I tested only 1 member, but after a sync *all* errors disappeared, not a single warning or error left.

edit:

Ah, cheered too early; some weird things seem to happen now (but it's still progress nonetheless). First of all, the gems often get colored as if they were enchants, but there are other problems:

Check out this char: http://blazeofglory.game-host.org/roste ... info&a=c:5
Compare to Armory: http://eu.wowarmory.com/character-sheet ... =An%C3%ABr
In particular, take a look at the chestpiece and the boots. The base stamina those items are supposed to have has disappeared.

More oddities (gems related):
http://blazeofglory.game-host.org/roste ... nfo&a=c:26
http://eu.wowarmory.com/character-sheet ... Cornholius
The headpiece and pants have 3x +9 spelldamage gems, only 1 gets shown in the roster

Still found 1 parsing error:
http://blazeofglory.game-host.org/roste ... info&a=c:6
It seems to trip over the +2% threat enchant
Last edited by Diska on Tue Dec 18, 2007 8:55 pm, edited 2 times in total.
User avatar
Diska
Roster AddOn Dev
Roster AddOn Dev
 
Posts: 179
Joined: Tue Jul 04, 2006 2:05 pm

Postby zanix » Wed Dec 19, 2007 1:47 am

Indeed it's a good start
At least most of the errors go away, and that was my goal when I found those strange characters
I started trying to figure out why things like '42 Stamina', '500 Armor', and 'Durability: 25 / 30' were not working. It looked like the regex should match the string, so I decided to put it through htmlentities()
Then I figured out that & nbsp; was being converted to the strange character. I also removed all tabs, since the only tab we need was already marked by '<span class="tooltipRight">'

Now we can actually start fixing the other problems

Postby Diska » Wed Dec 19, 2007 7:35 am

How about the code ds made? I was glancing at item.lib earlier today and saw a lot of commented-out code for a "web" method of parsing. I also recall that ds said he was "almost" done with the parsing for Armory data just before he disappeared. Anything that might be of use in there?

Postby PleegWat » Wed Dec 19, 2007 7:41 am

I personally think that's a bad idea. The tooltip column is complicated enough without having two separate syntaxes that can be in it.
I <3 /bin/bash
User avatar
PleegWat
WoWRoster.net Dev Team
WoWRoster.net Dev Team
 
Posts: 1636
Joined: Tue Jul 04, 2006 1:43 pm

Postby poetter » Wed Dec 19, 2007 12:57 pm

Nice shot, man. Gz

Will apply your changes to the next commit.
User avatar
poetter
Roster AddOn Dev
Roster AddOn Dev
 
Posts: 462
Joined: Sat Jun 30, 2007 9:41 pm
Location: Germany/Hamburg

Postby poetter » Wed Dec 19, 2007 1:14 pm

One last thing is needed to really catch everything:

Code: Select all
         $content = preg_replace('/\s\s+/', '', $content );
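
For anyone wondering what that pattern does on its own: it deletes every run of two or more whitespace characters and leaves single spaces alone. A small sketch on a made-up HTML fragment; `drop_whitespace_runs()` and the sample markup are hypothetical, only the regex is from the line above:

```php
<?php
// Hypothetical wrapper around the preg_replace() call quoted above.
function drop_whitespace_runs($content)
{
    // Two or more consecutive whitespace characters are removed entirely.
    return preg_replace('/\s\s+/', '', $content);
}

// Made-up fragment: indentation between tags, single spaces inside the text.
$raw = "<span>528 Armor</span>\n      <br>\n      <span>+21 Agility</span>";

echo drop_whitespace_runs($raw), "\n";
// <span>528 Armor</span><br><span>+21 Agility</span>
```

One thing to keep in mind: the run is removed, not collapsed to one space, so text that contains a double space ("a  b") comes out joined ("ab").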

Postby poetter » Wed Dec 19, 2007 1:46 pm

And here is some code for item.lib.php to get source, boss and droprate working:

Code: Select all
--- C:/Dokumente und Einstellungen/Daniel/Lokale Einstellungen/Temp/item.php-revBASE.svn000.tmp.php   Wed Dec 19 05:42:19 2007
+++ D:/xampp/roster.com/htdocs/lib/item.php   Wed Dec 19 05:41:38 2007
@@ -516,6 +516,27 @@
       return $html;
    }
 
+   function _getBoss()
+   {
+      $tmp = explode ( ':', $this->attributes['Boss'] );
+      $html = '<span style="color:#ffd800;">' . $tmp[0] . ':</span><span style="color:#ffffff;">' . $tmp[1] . '</span><br />';
+      return $html;
+   }
+
+   function _getSource()
+   {
+      $tmp = explode ( ':', $this->attributes['Source'] );
+      $html = '<br /><span style="color:#ffd800;">' . $tmp[0] . ':</span><span style="color:#ffffff;">' . $tmp[1] . '</span><br />';
+      return $html;
+   }
+
+   function _getDropRate()
+   {
+      $tmp = explode ( ':', $this->attributes['DropRate'] );
+      $html = '<span style="color:#ffd800;">' . $tmp[0] . ':</span><span style="color:#ffffff;">' . $tmp[1] . '</span><br />';
+      return $html;
+   }
+
    /**
     * Reconstructs item's tooltip from parsed information.
     * All HTML Styling is done in the private _getXX() methods
@@ -631,6 +652,18 @@
          {
             $html_tt .= $this->_getItemNote();
          }
+         if( isset($this->attributes['Source']) )
+         {
+            $html_tt .= $this->_getSource();
+         }
+         if( isset($this->attributes['Boss']) )
+         {
+            $html_tt .= $this->_getBoss();
+         }
+         if( isset($this->attributes['DropRate']) )
+         {
+            $html_tt .= $this->_getDropRate();
+         }
 
          if( ($this->DEBUG && $this->isParseError) || $this->DEBUG == 2 )
          {
@@ -1088,6 +1121,18 @@
             $tt['Attributes']['Set']['ArmorSet']['Name'] = $matches[1];
             $this->isSetPiece = true;
             $setpiece = 1;
+         }
+         elseif( ereg('^'. $roster->locale->wordings[$locale]['tooltip_source'], $line ) )
+         {
+            $tt['Attributes']['Source'] = $line;
+         }
+         elseif( ereg('^'. $roster->locale->wordings[$locale]['tooltip_boss'], $line ) )
+         {
+            $tt['Attributes']['Boss'] = $line;
+         }
+         elseif( ereg('^'. $roster->locale->wordings[$locale]['tooltip_droprate'], $line ) )
+         {
+            $tt['Attributes']['DropRate'] = $line;
          }
          elseif( $setpiece )
          {


And add these lines in your language to your locale file:

Code: Select all
$lang['tooltip_source']='Quelle';
$lang['tooltip_boss']='Boss';
$lang['tooltip_droprate']='Droprate';

Postby zanix » Wed Dec 19, 2007 3:06 pm

Cool, I'll add this

Postby poetter » Thu Dec 20, 2007 12:41 am

Nice, here are the German locales:

Code: Select all
--- C:/Dokumente und Einstellungen/Daniel/Lokale Einstellungen/Temp/deDE.php-revBASE.svn000.tmp.php   Wed Dec 19 16:39:42 2007
+++ D:/xampp/roster.com/htdocs/localization/deDE.php   Wed Dec 19 16:39:24 2007
@@ -603,9 +603,9 @@
 $lang['tooltip_preg_emptysocket']='/(Meta|Roter|Gelber|Blauer)(?:.?sockel)/i';
 $lang['tooltip_preg_reinforcedarmor']='/(Verstärkt \(\+\d+ Rüstung\))/';
 $lang['tooltip_preg_tempenchants']='/(.+\s\(\d+\s(min|sek)\.?\))\n/i';
-$lang['tooltip_source']='Source';
+$lang['tooltip_source']='Quelle';
 $lang['tooltip_boss']='Boss';
-$lang['tooltip_droprate']='Drop Rate';
+$lang['tooltip_droprate']='Droprate';
 
 $lang['tooltip_chance_hit']='Trefferchance'; // needs to find 'chance on|to hit:'
 $lang['tooltip_reg_requires']='Benötigt';

Postby zanix » Thu Dec 20, 2007 4:25 am

Done

Postby poetter » Wed Dec 26, 2007 6:38 am

When I fixed the character sex in French, I noticed that French tooltips are still not fully parsed.

So I decided to take a look, and what can I say: is it possible that French Armory tooltips are totally different from the CP ones?

Look at this:

Armory <=> CP

"390 Armure" <=> "Armure 390"
"Durabilité : 135 / 135" <=> "Durabilité: 135 / 135"
"83 Blocage" <=> "Bloquer 83"

So my question now is: what to do? Can we keep the regex flexible enough to handle both CP and AS, or do I have to make AS CP-compatible?


Another thing is yet another invisible character, this time in the socket bonus regex in frFR.php. Here's a patch:
Code: Select all
--- C:/Dokumente und Einstellungen/Daniel/Lokale Einstellungen/Temp/frFR.php-revBASE.svn000.tmp.php   Tue Dec 25 22:33:32 2007
+++ D:/xampp/roster.com/htdocs/localization/frFR.php   Tue Dec 25 22:33:16 2007
@@ -595,7 +595,7 @@
 $lang['tooltip_preg_durability']='/Durabilité(|:) (\d+) \/ (\d+)/';
 $lang['tooltip_preg_madeby']='/\<Artisan.+ (.+)\>/';  // this is the text that shows who crafted the item.
 $lang['tooltip_preg_bags']='/Conteneur (\d+) emplacements/';  // text for bags, ie '16 slot bag'
-$lang['tooltip_preg_socketbonus']='/Bonus de sertissage : (.+)\n/';
+$lang['tooltip_preg_socketbonus']='/Bonus de sertissage : (.+)\n/';
 $lang['tooltip_preg_classes']='/^(Classes..:.)(.+)$/'; // text for class restricted items
 $lang['tooltip_preg_races']='/^(Races..:.)(.+)$/'; // text for race restricted items
 $lang['tooltip_preg_charges']='/(\d+) Charges/i'; // text for items with charges
@@ -605,7 +605,7 @@
 $lang['tooltip_preg_tempenchants']='/(.+\s\(\d+\s(min|sec)\))\n/';
 $lang['tooltip_source']='Source';
 $lang['tooltip_boss']='Boss';
-$lang['tooltip_droprate']='Drop Rate';
+$lang['tooltip_droprate']='Fréquence de butin';
 
 $lang['tooltip_chance_hit']='Chances quand vous touchez...'; // needs to find 'chance on|to hit:'
 $lang['tooltip_reg_requires']='Niveau|requis|Requiert'; // À une main

Postby tuigii » Wed Dec 26, 2007 11:53 am

poetter wrote:Look at this:
Armory <=> CP
"390 Armure" <=> "Armure 390"
"Durabilité : 135 / 135" <=> "Durabilité: 135 / 135"
"83 Blocage" <=> "Bloquer 83"


I know.

You said it in words - this is the image showing it.
It prints the lines that aren't fully parsed, using:
Code: Select all
 $unparsed[]=$line; echo $line."  "

on line 1125 in item.lib

Another example is here.

About the preg_match concerning Armure
/Armure.+ (\d+)|(\d+) Armure/
or something like that ?
It seems to work for me, but :
In the case of CP [Armure +15] then $matches[1] returns 15
In the case of AS [15 Armure] then $matches[2] returns 15 - $matches[1] remains empty - oh oh.
Is there a preg_match pattern expert out there? Otherwise the code has to be changed 'just for the French roster's sake'.
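
One possible answer to the empty `$matches[1]` problem, as a sketch only: PCRE has a branch-reset group `(?|...)` in which every alternative starts numbering its capture groups from 1 again, so both word orders end up in `$matches[1]`. This assumes the PCRE library bundled with your PHP supports branch reset; `parse_armor_fr()` is a hypothetical helper, not code from item.lib, and the pattern is a loosened variant of the one above:

```php
<?php
// Hypothetical helper; not part of WoWRoster.
function parse_armor_fr($line)
{
    // (?|...) is a branch-reset group: the (\d+) in either alternative
    // is capture group 1, whichever alternative matched.
    if (preg_match('/(?|Armure.+?(\d+)|(\d+).+?Armure)/', $line, $m)) {
        return (int) $m[1];
    }
    return null;
}

var_dump(parse_armor_fr('Armure 390'));   // CP order
var_dump(parse_armor_fr('390 Armure'));   // AS order
```

If branch reset is not available, the plain alternation still works as long as the code picks whichever of `$matches[1]` and `$matches[2]` is not empty.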

Btw: this guy again: have a look at his gun: the word Lunette (Glasses) is causing trouble here.


This (Durability ):
'/Durabilit..(.:|) (\d+) \/ (\d+)/'
works well for CP and AS for French rosters.

Same thing for this :
$lang['tooltip_preg_socketbonus']='/Bonus de sertissage..:.(.+)\n/';

Blocage - Bloquer
With this :
$lang['tooltip_preg_block']='/(?:(Bloquer).+?(\d+))|((\d+).+?(Blocage))/i'
scans both CP (still ok) and AS also (Blocage is found) - but better have this tested - $matches[1] and $matches[2] aren't filled in.
CP scans


One last example,
Look here: http://eu.wowarmory.com/character-sheet ... &n=Chassax and take a close look at the tooltip of his helmet (Camail de gardien de la terre) - he has a purple gem.
Now, look here: http://www.les-potes-ages.fr/roster/ind ... nfo&a=c:85 (the same guy, in the roster).
Where is the purple gem? It isn't in the database (tooltip)!
Import error?
Last edited by tuigii on Tue Jan 01, 2008 4:11 am, edited 12 times in total.
User avatar
tuigii
WR.net Master
WR.net Master
 
Posts: 891
Joined: Wed Dec 27, 2006 12:57 pm
Location: Somewhere in the South Ouest of France

