
Why your site map may be damaging your rankings on Google


  • Why your site map may be damaging your rankings on Google

    While reviewing Google Webmaster Tools we found a report indicating that we had 203 line items (503 URLs) of duplicate content. This is not a report that jumps out at you.

    It can be found under Diagnostics->HTML suggestions in Google Webmaster Tools. The report is named "Duplicate title tags". On the first line of our report there were two URLs showing as duplicates:

    /Rubber-Perches_c_174-285.html /Rubber-Perches_c_174.html

    Support’s response was:

    The first URL doesn't appear to load a category page any longer, but the fact that Google has it indexed means that there was once 280+ pages of products in the Rubber Perches category, or there is an invalid link on an external website because the "-285" appended to the URL is the page number.

    This answer has two parts, and both are wrong. There were never 280 pages of products in the Rubber Perches category, because there have never been 280 rubber perches available in the history of our industry. The reference to a link on an external website is wrong because this report covers duplicate content ON our website; it has nothing to do with external links.

    Throughout the dozen or so exchanges, support never recognized the "duplicate content" issue, although I sent the actual report downloaded from Google.

    The Google Webmaster Tools report also contained (duplicate content) links that look like this:

    Bird-Parrot Travel Carriers
    Wood Bird Toys
    African Grey Supplies

    The recommendation from support to fix this problem was as follows:

    After I sent my last reply I thought of another solution you could try using to solve this problem. You could add these URL's to your robots.txt file like this (using the same category as an example):

    Disallow: /Rubber-Perches_c_174-*.html

    This will prevent search engines from indexing any of the numerous permutations of this URL (i.e. page numbers and sort method combinations would be numerous). The * wildcard should handle that while still allowing Rubber-Perches_c_174.html to be indexed. This would be less time consuming than entering 301 redirects for all of the old URL's.

    Thank you for choosing 3dCart! Please let us know if there is anything else we can do for you.
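
To see what that rule would actually cover, here is a minimal sketch of robots.txt wildcard matching. It assumes the matching rules Google documents for its own crawler (`*` matches any run of characters, a trailing `$` anchors the end, everything else is a prefix match). The helper name `rule_matches` is made up for illustration and is not part of any library.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    Assumes Google-style wildcards: '*' matches any characters,
    a trailing '$' anchors the end, otherwise it is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, giving prefix semantics.
    return re.match(regex, path) is not None

rule = "/Rubber-Perches_c_174-*.html"
for path in ("/Rubber-Perches_c_174-285.html", "/Rubber-Perches_c_174.html"):
    print(path, "blocked" if rule_matches(rule, path) else "allowed")
```

Under these assumptions the rule only covers the suffixed variants of that one category; it does nothing about duplicate titles elsewhere in the report.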

    This set of instructions is wrong and will cost you money if you follow it. Besides the examples above, when I examined all of the duplicate-title URLs I realized that almost 50 in the list were genuine duplicates, stemming from cloning listings without changing the title tags. For example:

    Samsula Stainless Steel Parrot Cage - Freedom Cages | Wekiwa Stainless Steel Parrot Cage - Freedom Cages | Ocala Stainless Steel Parrot Cage - Freedom Cages

    The second two listings were cloned from the first, but the title tags were never changed. Thus, if I had followed the directions above, I would have been blocking Googlebot from spidering these listings.

    I made manual corrections to the listings with duplicate titles, which fixed that problem. The tech I was communicating with still never gave me an answer as to where all of these URLs were coming from.
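
A quick way to spot cloned listings like these is to group pages by title tag. The sketch below assumes you already have (URL, title) pairs, for example from a store export or a crawl; the URLs and the function name `find_duplicate_titles` are hypothetical and only there for illustration.

```python
from collections import defaultdict

def find_duplicate_titles(pages):
    """Group URLs by normalized title and keep titles used more than once."""
    by_title = defaultdict(list)
    for url, title in pages:
        # Normalize case and whitespace so trivial differences do not hide clones.
        by_title[" ".join(title.lower().split())].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}

# Hypothetical URLs; the shared title mimics a cloned listing.
pages = [
    ("/samsula-cage.html", "Samsula Stainless Steel Parrot Cage - Freedom Cages"),
    ("/wekiwa-cage.html", "Samsula Stainless Steel Parrot Cage - Freedom Cages"),
    ("/ocala-cage.html", "Samsula Stainless Steel  Parrot Cage - Freedom Cages"),
    ("/rubber-perches.html", "Rubber Perches"),
]

for title, urls in find_duplicate_titles(pages).items():
    print(title, "->", urls)
```

Each group it reports is a listing whose title tag needs a manual fix, which is different from the paginated category URLs.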

    One answer that I received was as follows:

    I wouldn't necessarily say it's "normal" to have this issue occurring, but it's certainly possible with one of the scenarios I outlined previously. In the future, when our canonical link tags are updated to remove pagination in the URL, it will take care of this problem automatically. Until then, you would just need to monitor your crawl reports, as you have been, to see if there is an external website linking to these bad URL's and request their removal from the Google index and add the 301 redirects.

    It turns out the cause he described was not the problem. The answers I was getting were incomplete and I needed more clarity, so I asked to escalate the ticket. The response was:

    There is no need to escalate this ticket, as the solution to the problem would not be any different. I keep referring to other websites having links to these URL's as *one* of the possibilities for where Google is getting this URL. Google keeps indexing this URL because it either existed at one time, OR it is linked from another web page. Unless you follow one of my suggestions, Google will continue to index this page, because - as you've already noticed - the URL returns a valid page (albeit with no products).

    Once again, a reference to external URLs, which had nothing to do with the problem.

    Contrary to support's feeling that "the solution to the problem would not be any different," I felt that a collaborative effort would be in our best interest, and it took the better part of the day to actually get a response from level II support. Had I received the level II answer early on, there would have been no need to post 14 replies to the original ticket.

    The response I got from level II was as follows:

    Actually, in that case it would be best to just clear the sitemap cache and resubmit the sitemap to Google so they have an updated list of your URL's. Information about clearing sitemap cache can be found here:

    What I was referring to is related to duplicate content errors that occur because categories are paginated, which creates more than one category page, with similar content, but different canonical URLs.

    This was my Ah-Ha moment

    Included in the list of duplicate content on Google were the following URLs

    Sample Category | Sample Category

    These categories come with your store out of the box. This means that unless you clear your site map cache and resubmit the site map to Google, the site map hangs on to every category you have ever deleted and every product you have renamed, creating duplicate content that Google will penalize you for.

    We have had this store for nine months and have generally been pleased, but knowing to clear the site map cache would have been good information from the beginning, and it will help with your Google rankings as well as ours.
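
One way to check whether a site map is still carrying dead entries is to compare its URLs against the pages that exist in the store today. This is only a sketch: the XML namespace is the standard sitemaps.org one, but the example.com URLs, the `live_urls` set, and the function name `stale_sitemap_urls` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Standard namespace used by sitemaps.org-format site maps.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def stale_sitemap_urls(sitemap_xml, live_urls):
    """Return site map URLs that are not in the set of live store URLs."""
    root = ET.fromstring(sitemap_xml)
    listed = [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]
    return [url for url in listed if url not in live_urls]

# Hypothetical site map that still lists a deleted sample category.
sitemap_xml = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/Rubber-Perches_c_174.html</loc></url>
  <url><loc>https://example.com/Sample-Category_c_1.html</loc></url>
</urlset>
"""
live_urls = {"https://example.com/Rubber-Perches_c_174.html"}

print(stale_sitemap_urls(sitemap_xml, live_urls))
```

Anything this reports is a candidate for a cache clear and resubmission, or a 301 redirect if the page has moved.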

  • #2
    Thank you for sharing this - I have had a hard time with figuring out where URL's showing up in Webmaster Tools have come from. This explains it.
    TC Life Safety
    TC Wireless


    • #3
      Duplicate content

      I knew I wasn't alone :-)


      • #4
        I wouldn't worry so much about it. I am currently smashing rankings with my site for various juicy competitive keywords.


        • #5
          WCP, thank you so much for your detailed and attentive explanation.

          Last edited by Brtp4; 03-08-2012, 02:15 PM.


          • #6
            One thing to add / ask: can anyone explain why the software would generate multiple URLs for a manufacturer, when nothing on that mfg has changed?



            • #7
              The links for the manufacturer make sense.

              is manufacturer 15, sorted by 0 (name), page 1
              is manufacturer 15, sorted by 0 (name), page 2

              However I think you will also find manufacturer pages 15-1 and 15-2 which would correspond to these same pages.

              If you generate a site map using a crawler, you will find all these variations. Each category page will be listed both with and without the sorting method number:

              category-x-0-1 and category-x-1 for category x page 1

              Also, page one would show up with the -1 and without it
              so category-x , and category-x-1

              The presentation of pages to crawlers is just a mess.
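
To gauge how many of these variants a crawl has picked up, you can collapse each URL to its base page and group them. The sketch below assumes category URLs of the form `Name_c_<id>.html` with optional `-<page>` or `-<sort>-<page>` suffixes, as in the examples in this thread; the regex and function names are illustrative, not anything the cart software provides.

```python
import re
from collections import defaultdict

# Assumed shape: /Name_c_<id>[-<sort>]-<page>.html
VARIANT = re.compile(r"^(?P<base>/.+_c_\d+)(?:-\d+){1,2}\.html$")

def canonical_path(path):
    """Strip the sort/page suffix, leaving the base category URL."""
    match = VARIANT.match(path)
    return match.group("base") + ".html" if match else path

def group_variants(paths):
    """Map each base category URL to every crawled variant of it."""
    groups = defaultdict(list)
    for path in paths:
        groups[canonical_path(path)].append(path)
    return dict(groups)

crawled = [
    "/Rubber-Perches_c_174.html",
    "/Rubber-Perches_c_174-1.html",
    "/Rubber-Perches_c_174-0-1.html",
    "/Rubber-Perches_c_174-285.html",
]
for base, variants in group_variants(crawled).items():
    print(base, "has", len(variants), "URL variants")
```

Every group with more than one entry is a page that crawlers can reach under several addresses.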


              • #8
                Yes, and the last page for the manufacturers is always extra / blank / without products.

                Some sort of problem with the custom page names on blog pages too.

                Last edited by Brtp4; 03-10-2012, 02:01 PM.


                • #9
                  The link to how to clear the sitemap cache doesn't work.


                  • #10
                    dupe content

                    I have 3 issues (for the sake of time) with their answer

                    1) We change category and product titles all the time in an effort to increase conversion. ALL our categories have sorting options, so "This will only block search engines from indexing category URLs that have pagination or sorting options following it" would block all our categories.

                    2) whether it be robots.txt or site map cache clearing - they are adding unnecessary layers of website development because of 1)

                    They don't talk about these issues anywhere when you open the store so it's clearly a bug in the site architecture.

                    3) If the "site map clear cache solution" was effective I wouldn't still be seeing Sample Category | Sample Category
                    these are the cats that come with the store that I deleted in March 2011.

                    I can show you the part of the thread where they say "the paginated number indicates the quantity of products in that category at certain point in time". That's when I knew they were making this up as they went.


                    • #11
                      Yes, I agree completely.

                      Last edited by Brtp4; 03-08-2012, 02:17 PM.


                      • #12
                        dupe content

                        I have a meeting with our SEO company soon. I'll share the feedback; it's high on my list.


                        • #13
                          Any news on this?



                          • #14
                            clearing site map

                            support is clueless

                            Hello Mitch,

                            Resubmitting the sitemap would only help in the case where you changed an existing category/product/page name, and it would not happen right away.

                            Jeff M
                            3dcart Technical Support

                            The list continues to grow

                            I don't know what that answer means - I just added 50 no follows - we'll see what happens.


                            • #15
                              I clear my sitemap cache all the time, it doesn't seem to help...

                              Well it's 2012 and I just noticed I am having the same issue as what is listed on this thread. I clear the sitemap cache every few days and I still have over 200 duplicate content warnings and over 200 meta data duplicate warnings. So far I have not got anything from support about it, still waiting. Anyone ever find out about this? I really feel it is hurting my dog boutique online rankings but that is just a guess.