OPINION
[Read our latest rebuttal to Mr Geoffrey Pereira's second article on 13 November 2009 here]
For the first time in Singapore's media history, the state media has actually bothered to publish an article refuting claims made by a blog. Or has it?
Straits Times journalist Geoffrey Pereira published an article on his ST blog explaining why SPH is not responsible for the "attack" on Temasek Review.
The irony is that he appears more interested in convincing readers that our article had accused SPH of the attack!
If SPH were really interested in disputing our claims, it would have issued a simple clarification:
"SPH had conducted an internal investigation and found that nobody from SPH was accessing TR during the period when the alleged (grabbing) incident is reported to take place by TR which is between 2200 hours on 31st October and 0100 hours on 1st November."
A statement like the above would have more than sufficed to rebut our allegations and put the matter to rest, unless we decided to pursue it.
Unfortunately, Mr Geoffrey chose to launch a counter-accusation against us for "maligning" SPH due to our "ignorance" of our own server.
Mr Geoffrey is half-right in the sense that we are not IT experts, which is precisely why it is impossible for us to fabricate any charges against SPH without information provided by our system administrators in China.
We checked the authenticity and accuracy of the server log with our system administrator and data centre three times before we decided to release the article.
Our correspondent was told about the "intrusion" in Mandarin and had to translate into English all the IT jargon, which he did not understand.
Mr Geoffrey began his article by distorting our words to mislead readers into believing that we were accusing SPH of the attack:
"It (TR) started by defining a Distributed Denial of Service (DDOS) attack – essentially as when a server is bombarded with requests so as to overload and cripple it.
It then went on to say that its monitoring had shown that during a recent period, there was a flurry of network requests coming from an SPH IP address.
Put this together and it is no less than an accusation that SPH had launched an Internet attack on TR."
As our site had been down a day earlier due to a DDOS attack, our correspondent took extra care to define what DDOS and IP addresses mean so that lay readers would not get the wrong idea that SPH was somehow responsible for the attack.
Mr Geoffrey deliberately chose to omit the crucial paragraph in our article, which stated clearly that the flurry of network requests was "grabbing" content from our site.
"On or about 31st October 2009, around 2200 hours to 1st November 0100 hours, while our system administrator was doing a routine check on the server and firewall, he noticed a flurry of network communication requests coming from one single IP address concurrently which caused our server’s load to increase tremendously.
From the log, it seems to suggest that whoever doing that is using a Web Grabber Software with the aim of getting all the content from our site since a single web browser is unlikely to be reading our entire site all at a go."
Nowhere did we mention that this led to a DDOS attack. How Mr Geoffrey managed to jump to that conclusion is anybody's guess.
Mr Geoffrey then took issue with the URL of our article -
https://www.tremeritus.net/2009/11/02/sph-and-recent-ddos-attack-on-temasek-review/
and wrote:
".....and if SPH is not being accused of a DOS attack, why associate it with this URL title?"
Mr Geoffrey is obviously not a WordPress user. Temasek Review is based on a WordPress blog. WordPress users will know that whenever you post a new article, WordPress automatically generates a URL for you almost immediately, before you have even finished typing the first sentence.
Our correspondent had not finalized the title of the article when WordPress saved the permalink, which is no fault of his.
The article was saved as a draft, vetted by another correspondent and our system administrator the next day, and then published on Monday, 2 November 2009.
To give an example, the title of our rebuttal to Mr Geoffrey's article is "Demolishing SPH’s claims in Mr Geoffrey’s misleading article: “Attack on Temasek Review – not SPH”", but the URL to the article is simply:
https://www.tremeritus.net/2009/11/06/a-rebuttal-to-sphs/
because WordPress saved the permalink automatically in the first few seconds while the article was being typed, which explains why it is entirely different from the title.
It was a technical oversight on our part and we apologize for it. The title of the article is self-explanatory - "SPH IP address caught “grabbing” content from Temasek Review server". There was nothing mentioned about any DDOS attack.
The article simply laid down the facts:
1. An IP address from SPH was detected "grabbing" our site during the stated time period.
2. The flurry of network requests was detected by our firewall and logged on the system.
Using the URL of our article to charge us with falsely accusing SPH of a DDOS attack does not hold much ground when the article merely claimed that an SPH IP address was logged by our firewall.
Had we wanted to accuse SPH of launching an attack on us, why would we have bothered to define DDOS for our readers in the first place, and to ask SPH why its IP address was caught "grabbing" our content?
Furthermore, the time period used by Mr Geoffrey is completely off the mark. It did not appear in our original article at all, even before the amendments to the time were made:
"In fact, from midnight on Nov 1 from 1 am to about 6 am, (covering a period of the alleged attack) no one from SPH accessed the TR site."
Our server log showed that nobody from the IP address belonging to SPH accessed TR on Nov 1 from 1am to 6am.
Mr Geoffrey is dead wrong: the period stated by him DID NOT cover the period of the alleged attack. We had never published it in our article. Where then did he get the information from?
Mr Geoffrey admitted in the later part of his article that SPH staff did access TR during the period:
"Data made available to me covered a 3-day period starting before and ending after the alleged attack. It showed that about 25 SPH employees – including yours truly, a regular reader – visited TR; but we did not create the kind of flurry of Net activity that would slow a server down, much less precipitate a DOS."
Since Mr Geoffrey did not state categorically that NOBODY from SPH accessed the TR site from 10pm on 31st October to 1am on 1st November, we can infer from his statement above that somebody, or some bot, from SPH did access TR during this period.
Mr Geoffrey went on to say our server was not "overloaded" by anybody accessing our data from SPH:
"Neither did anyone in SPH try to “grab” TR material in a way that would load its server; nor did any SPH staffer launch any attack on the server."
Our correspondent did not accuse SPH of overloading or slowing our site. We are only concerned about the flurry of network communication requests coming at one go from the same IP address belonging to SPH:
"Fortunately, our new anti-DDOS firewall managed to stop these requests to prevent them from loading the server thus causing it to slow down. A shared server with limited bandwidth would have crashed....
By doing so covertly without asking for our permission and flooding our server with so many network communications request at one go, it will slow down the site, retard the loading speed of the pages and can potentially cause the server to crash (though highly unlikely in our new dedicated server)."
In this instance, the manner in which content was being accessed is consistent with a search robot or a web grabber, i.e. software that archives a website so that a string search can be made on it.
While this is perfectly legal, some software uses multiple sockets when downloading content, and CAN potentially hog resources from the web server and slow other users’ access.
Such software would hog the server’s resources, but in this incident it did not, because the software firewall on the server banned the offending IP address minutes into the action, after the address exceeded 60 connects per minute, the threshold set by the system administrator.
Technically, had the server been poorly configured and not protected by a firewall, a flood of requests in excess of 60 connects per minute WOULD HAVE brought the server down, and that would technically be classified as an attack.
The above information was provided by our system administrator, who has no reason to fabricate it to discredit SPH. Neither can IT idiots like us produce such a technical explanation on our own.
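For readers who want to see the mechanics, the threshold rule described above can be sketched in a few lines of Python. This is an illustration only and not the firewall software running on the server: the class name, the sliding-window approach and the IP addresses (reserved documentation addresses) are all assumptions made for the example. Only the figure of 60 connects per minute comes from the explanation above.

```python
from collections import defaultdict, deque


class ConnectionRateLimiter:
    """Ban an IP once it makes more than `max_connects` connects within `window_seconds`."""

    def __init__(self, max_connects=60, window_seconds=60.0):
        self.max_connects = max_connects
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # ip -> timestamps of recent connects
        self.banned = set()

    def allow(self, ip, now):
        """Record a connect at time `now` (in seconds); return False if it is refused."""
        if ip in self.banned:
            return False
        recent = self._recent[ip]
        # Forget connects that have fallen out of the sliding window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        recent.append(now)
        if len(recent) > self.max_connects:
            self.banned.add(ip)
            return False
        return True


limiter = ConnectionRateLimiter()
# Hypothetical grabber: five connects per second from a single address.
grabber = [limiter.allow("192.0.2.10", t * 0.2) for t in range(100)]
# Hypothetical reader: one page every 30 seconds.
reader = [limiter.allow("198.51.100.7", t * 30.0) for t in range(100)]
print(grabber.count(True), "grabber connects accepted before the ban")
print("reader banned:", "198.51.100.7" in limiter.banned)
```

In this sketch the fast client is cut off as soon as it exceeds the threshold, while the slow reader is never affected, which is the behaviour the system administrator described.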
Mr Geoffrey need only answer the most important question:
Did any SPH employee, bot, spider or whatever access TR during the period when the alleged "grabbing" was said to take place, from 10pm (31 Oct) to 1am (1 Nov)?
If the above is true, it means that what was captured on our server log was correct and that would prove our case that an employee, bot, spider or whatever from SPH was indeed accessing or grabbing our "content".
The ball is in SPH's court to reveal the identities of the SPH staff or whatever who/which were accessing / grabbing our site during this period of time. Whether or not it caused our server to overload is beside the point, since our firewall was able to withstand the communication requests from the IP address.
ADDENDUM:
Some readers defended SPH on the ground that its employee or bot might be simply "browsing" our site for "research" purposes and blamed us for hanging him/her out to dry.
However, content "grabbing" is not equivalent to innocuous "browsing", which explains why we raised the alarm on the advice of our system administrator.
Attached below are two snapshots of our log to explain the difference between the two:
#1 Snapshot from the incident:
As we can see from above, the same IP address 203.116.231.234, which was traced back to SPH, was logged connecting to the server repeatedly at VERY SHORT intervals, which is why the IP appears in the log again and again, one entry immediately after another.
This is characteristic of web grabber software (it could also be a sort of SYNC attack, but a SYNC attack would be requesting the same content repeatedly instead of multiple pages); it is certainly not characteristic of any browser.
#2 Snapshot taken on 6 November 2009, 9pm during "normal" browsing:
A normal browser reading the site would also show up in the log under the same IP address, but not as repetitively or as closely spaced in time as in snapshot #1 above.
The IP address of the reader will still be logged but the interval between connects would be greater and not one after another.
In the highlighted example, these would be 202.156.13.246 and 202.156.12.228, which are accessing the site ‘normally’.
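The difference between the two patterns comes down to the gap between successive connects from the same IP address. A rough Python sketch of such a check follows. The log entries are made up for illustration (they are not taken from our server log), the addresses are reserved documentation addresses, and the one-second cutoff is an assumed value rather than anything our system administrator specified.

```python
from collections import defaultdict
from statistics import median


def median_gap_by_ip(entries):
    """Return {ip: median seconds between successive connects} for IPs with 2+ entries."""
    times = defaultdict(list)
    for timestamp, ip in entries:
        times[ip].append(timestamp)
    gaps = {}
    for ip, stamps in times.items():
        stamps.sort()
        if len(stamps) >= 2:
            gaps[ip] = median(b - a for a, b in zip(stamps, stamps[1:]))
    return gaps


def looks_like_grabber(entries, max_gap_seconds=1.0):
    """Flag IPs whose typical gap between connects is below the assumed cutoff."""
    return {ip for ip, gap in median_gap_by_ip(entries).items() if gap < max_gap_seconds}


# Made-up log entries as (seconds since start, IP address) pairs.
sample_log = (
    [(i * 0.25, "192.0.2.10") for i in range(40)]            # back-to-back connects
    + [(i * 45.0, "198.51.100.7") for i in range(5)]         # one page every 45 seconds
    + [(10.0 + i * 60.0, "198.51.100.8") for i in range(4)]  # one page every minute
)
flagged = looks_like_grabber(sample_log)
print(flagged)
```

Only the address that connects back to back is flagged; the two slow readers are not. A check like this shows that one address behaved differently from ordinary readers, but it cannot by itself say who or what was behind that address.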
Since Mr Pereira has admitted that SPH employees were visiting TR during the time period when the "grabbing" incident shown in snapshot #1 was alleged to have taken place, he should be able to answer the three most important questions we have been asking all along:
1. Who was the employee "grabbing" content from TR during the stated time period?
2. Was he/she using web grabber software to do so?
3. What were his/her motives for "grabbing" our site?
All we ask for is an explanation of what really happened. We will close the case once we get the answers we are waiting for.
Related articles:
>> Debunking Mr Geoffrey's claims in his misleading article: "Attack on Temasek Review: not SPH
>> Attack on Temasek Review: not SPH
>> SPH IP caught grabbing "content" from Temasek Review
>> Debunking Mr Geoffrey's claims on "IP spoofing"