The international SEO community has criticized Baidu for being slow to crawl and index websites that are not hosted in China. Best practices such as submitting a sitemap.xml to Baidu Webmaster Tools sometimes do not help much, particularly for websites with more than 10,000 URLs. Below we introduce an alternative, API submission, that can improve the situation.
What are the benefits of API submission?
Faster crawling and indexing of new URLs: API submission shortens the time Baidu's crawlers need to discover new links on your site, so new pages can be indexed as early as possible.
Protection of original content: API submission quickly notifies Baidu that the website has published new original content, so that Baidu can index the content before it is plagiarized.
How to use API submission?
Go to Baidu Webmaster Tools and select Ordinary Collection in the sidebar. After clicking the API submission button, you will see the interface calling address, which contains your token. The token is a string of 16 English letters and digits, for example edk7yc4rEZP9pDQD.
Options of API submission:
1) Curl submission
Write the URLs to be submitted into a local file, such as urls.txt, with one URL per line, and then call the curl command:
curl -H 'Content-Type:text/plain' --data-binary @urls.txt "http://data.zz.baidu.com/urls?site=www.example.com&token=edk7yc4rEZP9pDQD"
If you are using PHP, Python, Java, etc., you can follow the same process to submit structured data.
2) POST submission
POST /urls?site=www.example.com&token=edk7ychrEZP9pDQD HTTP/1.1
User-Agent: curl/7.12.1
Host: data.zz.baidu.com
Content-Type: text/plain
Content-Length: 59

http://www.example.com/1.html
http://www.example.com/2.html
3) PHP submission
$urls = array(
    'http://www.example.com/1.html',
    'http://www.example.com/2.html',
);
// Interface calling address: replace site and token with your own values.
$api = 'http://data.zz.baidu.com/urls?site=www.example.com&token=edk7ychrEZP9pDQD';
$ch = curl_init();
$options = array(
    CURLOPT_URL => $api,
    CURLOPT_POST => true,
    // Return the response as a string instead of printing it directly.
    CURLOPT_RETURNTRANSFER => true,
    // The request body is the list of URLs, one per line.
    CURLOPT_POSTFIELDS => implode("\n", $urls),
    CURLOPT_HTTPHEADER => array('Content-Type: text/plain'),
);
curl_setopt_array($ch, $options);
$result = curl_exec($ch);
echo $result;
4) Ruby submission
require 'net/http'
require 'uri'
urls = ['http://www.example.com/1.html', 'http://www.example.com/2.html']
# Interface calling address: replace site and token with your own values.
uri = URI.parse('http://data.zz.baidu.com/urls?site=www.example.com&token=eTk7ychrWZP1pDQD')
req = Net::HTTP::Post.new(uri.request_uri)
# The request body is the list of URLs, one per line.
req.body = urls.join("\n")
req.content_type = 'text/plain'
res = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(req) }
puts res.body
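Python is mentioned above but not shown. The following is a minimal sketch of the same request using only the standard library; it is not an official Baidu client. The function names build_request and submit are hypothetical, and the site and token are placeholders to be replaced with your own values.

```python
import urllib.error
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
API_ENDPOINT = "http://data.zz.baidu.com/urls"


def build_request(site, token, urls):
    """Build the POST request: one URL per line, sent as text/plain."""
    query = urllib.parse.urlencode({"site": site, "token": token})
    body = "\n".join(urls).encode("utf-8")
    return urllib.request.Request(
        f"{API_ENDPOINT}?{query}",
        data=body,
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def submit(site, token, urls, timeout=10):
    """Send the request and return (status_code, response_text)."""
    request = build_request(site, token, urls)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # urllib raises on 4XX/5XX; the error body still carries the JSON message.
        return err.code, err.read().decode("utf-8")
```

Calling submit("www.example.com", "edk7yc4rEZP9pDQD", ["http://www.example.com/1.html"]) would send one URL and return the status code together with the raw response text. Building the request in a separate function keeps the part that can be checked offline apart from the network call.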
How to check the feedback of API submission?
You can tell whether the data was submitted successfully from the status code and the fields returned after submission.
1. The status code is 200, indicating that the submission is successful. The following fields may be returned:
Field | Required or Not | Parameter Type | Description |
success | Required | int | Number of URLs submitted successfully |
remain | Required | int | Number of URLs that can still be submitted today |
not_same_site | Optional | array | URLs that were not processed because they do not belong to this site |
not_valid | Optional | array | URLs that are invalid |
The example of a successful return:
{
    "remain": 4999998,
    "success": 2,
    "not_same_site": [],
    "not_valid": []
}
2. The status code is 4XX or 500, indicating that the submission failed. The returned fields are:
Field | Required or Not | Parameter Type | Description |
error | Required | int | Error code, same as status code |
message | Required | string | Error description |
3. Meanings of common failed submission returns:
error | message | Meaning |
400 | site error | The site is not verified on the webmaster platform. |
400 | empty content | Post content is empty. |
400 | only 2000 urls are allowed once | You can submit at most 2,000 links per request. |
400 | over quota | The submission exceeds the daily quota and is rejected. |
401 | token is not valid | The token is incorrect. |
404 | not found | The interface address is incorrect. |
500 | internal error, please try later | The server is occasionally abnormal. Usually a retry will succeed. |
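The status codes and fields above can be turned into a small decision helper. The sketch below is one possible way to do it, not part of Baidu's interface: the names chunk and interpret and the action labels "ok", "retry", "stop" and "fix" are hypothetical. The only facts it relies on are the field names, the error messages, and the 2,000-link limit quoted in the tables.

```python
import json

# Per-request limit quoted in the "only 2000 urls are allowed once" error.
MAX_URLS_PER_REQUEST = 2000


def chunk(urls, size=MAX_URLS_PER_REQUEST):
    """Split the URL list into batches no larger than the per-request limit."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def interpret(status, body):
    """Return (action, details) for a response; the action labels are our own."""
    try:
        data = json.loads(body)
    except ValueError:
        data = {}
    if not isinstance(data, dict):
        data = {}

    if status == 200:
        # Optional fields fall back to empty lists when absent.
        return "ok", {
            "success": data.get("success", 0),
            "remain": data.get("remain", 0),
            "not_same_site": data.get("not_same_site", []),
            "not_valid": data.get("not_valid", []),
        }

    message = data.get("message", "")
    if status == 500:
        return "retry", {"message": message}  # a retry usually succeeds
    if message == "over quota":
        return "stop", {"message": message}   # daily quota is used up
    # Site, token, address, or payload problem: correct the settings first.
    return "fix", {"message": message}
```

A caller would typically loop over chunk(all_urls), submit each batch, and stop as soon as interpret returns "stop" or "fix", since further requests that day would fail for the same reason.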
FAQ of API submission
1. What is the difference between API submission and the sitemap submission interface?
A: Status feedback is more timely. Previously, after submitting sitemap.xml, you had to log in to the search resource platform to check whether the submission succeeded. Now the status code and the returned fields tell you the result right after submission.
2. What needs to be modified in the existing program for submitting sitemap.xml data?
A: Two changes are needed. First, change the interface address that you submit to. Second, process the information returned by the interface. If a submission fails, adjust the corresponding settings according to the error message; links for which an error is reported have not been submitted successfully.
3. Why don’t I see the data change after a successful API submission?
A: The feedback counts only newly submitted links. A link that has been submitted before (a repeated submission) is not counted.
4. When is the best time to use API submission?
A: Submit a link as soon as it is generated or published.
5. What is the difference between submitting one piece of data at a time and multiple pieces at a time?
A: No difference.
6. What are the negative effects of repeatedly submitting links that were already submitted?
A: There are two. First, your quota is wasted. Each site can submit only a limited number of links per day, so resubmitting old links may leave no room for new ones. Second, if you frequently resubmit old links, Baidu will lower the site’s quota and you may lose access to the API submission function.
7. How many links can I submit through API submission at most?
A: The upper limit depends on the number of new, valuable links that you submit. Baidu adjusts the limit from time to time accordingly: the more new, valuable links you submit, the higher the limit.
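Since repeated submissions waste quota (see question 6), it can help to keep a local record of what has already been sent. The following is a minimal sketch under the assumption that a plain text file with one URL per line is an acceptable log; the function names and the log file are hypothetical and not part of Baidu's tooling.

```python
from pathlib import Path


def filter_new_urls(candidates, log_path):
    """Return URLs not yet in the log, keeping order and dropping duplicates."""
    log = Path(log_path)
    seen = set(log.read_text(encoding="utf-8").split()) if log.exists() else set()
    fresh = []
    for url in candidates:
        if url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh


def record_submitted(urls, log_path):
    """Append successfully submitted URLs to the log, one per line."""
    with open(log_path, "a", encoding="utf-8") as handle:
        for url in urls:
            handle.write(url + "\n")
```

The intended order is to call filter_new_urls before each submission and record_submitted only after the interface reports success, so that failed links remain eligible for the next attempt.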