<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Irregular Time Series? No. Oversampling.</title>
	<atom:link href="http://www.excelcharts.com/blog/irregular-time-series-oversampling/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.excelcharts.com/blog/irregular-time-series-oversampling/</link>
	<description>Effective Charts and Dashboards for Excel users</description>
	<lastBuildDate>Fri, 03 Feb 2012 17:31:47 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Readings Round-Up #5 &#8211; mutually occluded</title>
		<link>http://www.excelcharts.com/blog/irregular-time-series-oversampling/#comment-1179</link>
		<dc:creator>Readings Round-Up #5 &#8211; mutually occluded</dc:creator>
		<pubDate>Thu, 12 Mar 2009 16:31:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.excelcharts.com/blog/?p=697#comment-1179</guid>
		<description>[...] Irregular Time Series? No. Oversampling. &#124; Jorge Camoes&#8217; Charts &#8220;If you are a market researcher, and you want to make sure that you get more reliable results for a subgroup in a survey, what do you do? You must increase the overall sample size (and spend a lot of money), right? Actually, you don’t. You can oversample that group only, and then weight it down to its known proportion in the population. For example, you may want to increase the number of managers and decrease the number of housewives (because the former are usually more heterogeneous than the latter). Oversampling is a common research method, and a very cost-effective way to get precise estimates for a subgroup.&#8221; [...]</description>
		<content:encoded><![CDATA[<p>[...] Irregular Time Series? No. Oversampling. | Jorge Camoes&#8217; Charts &#8220;If you are a market researcher, and you want to make sure that you get more reliable results for a subgroup in a survey, what do you do? You must increase the overall sample size (and spend a lot of money), right? Actually, you don’t. You can oversample that group only, and then weight it down to its known proportion in the population. For example, you may want to increase the number of managers and decrease the number of housewives (because the former are usually more heterogeneous than the latter). Oversampling is a common research method, and a very cost-effective way to get precise estimates for a subgroup.&#8221; [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonah Feld</title>
		<link>http://www.excelcharts.com/blog/irregular-time-series-oversampling/#comment-1178</link>
		<dc:creator>Jonah Feld</dc:creator>
		<pubDate>Wed, 11 Mar 2009 18:02:52 +0000</pubDate>
		<guid isPermaLink="false">http://www.excelcharts.com/blog/?p=697#comment-1178</guid>
		<description>@Jorge

I think some of my comment was ambiguous. When I wrote, &quot;what happened between most measurements&quot; I was referring to missing measurements, not the time within an interval during which there are no measurements.

Agreed, balance is the key. For any continuous variable, you have to ask if smaller intervals would change the signal or just introduce noise. You&#039;re completely right, the trade-off is required for any chart of continuous data.

You got me thinking, though. Is Nasdaq value at close really continuous? Nasdaq value during market hours is continuous. With unequal interval observations you would definitely have a continuous function. But if you rigidly define the interval, does that change the problem? If so, does that lend more weight to Stephen&#039;s argument in favor of equal intervals?

Maybe the Nasdaq example is confusing. Is there a difference (of problem attributes, not actual values) between the questions of, &quot;How many people are in an office building at any one time today?&quot; and, &quot;How many people showed up to work today?&quot;</description>
		<content:encoded><![CDATA[<p>@Jorge</p>
<p>I think some of my comment was ambiguous. When I wrote, &#8220;what happened between most measurements&#8221; I was referring to missing measurements, not the time within an interval during which there are no measurements.</p>
<p>Agreed, balance is the key. For any continuous variable, you have to ask if smaller intervals would change the signal or just introduce noise. You&#8217;re completely right, the trade-off is required for any chart of continuous data.</p>
<p>You got me thinking, though. Is Nasdaq value at close really continuous? Nasdaq value during market hours is continuous. With unequal interval observations you would definitely have a continuous function. But if you rigidly define the interval, does that change the problem? If so, does that lend more weight to Stephen&#8217;s argument in favor of equal intervals?</p>
<p>Maybe the Nasdaq example is confusing. Is there a difference (of problem attributes, not actual values) between the questions of, &#8220;How many people are in an office building at any one time today?&#8221; and, &#8220;How many people showed up to work today?&#8221;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: admin</title>
		<link>http://www.excelcharts.com/blog/irregular-time-series-oversampling/#comment-1177</link>
		<dc:creator>admin</dc:creator>
		<pubDate>Mon, 09 Mar 2009 22:56:34 +0000</pubDate>
		<guid isPermaLink="false">http://www.excelcharts.com/blog/?p=697#comment-1177</guid>
		<description>Jonah: &quot;Implying that the movement from one point to the next is gradual is misleading.&quot; You are right, but with continuous data we must find the right balance between level of detail and the data we need to answer a specific question. Using line charts to display Nasdaq at close or decennial census data could be misleading because we are assuming a non-existent graduality, but how can we ever use a line chart if we can&#039;t accept this trade-off?</description>
		<content:encoded><![CDATA[<p>Jonah: &#8220;Implying that the movement from one point to the next is gradual is misleading.&#8221; You are right, but with continuous data we must find the right balance between level of detail and the data we need to answer a specific question. Using line charts to display Nasdaq at close or decennial census data could be misleading because we are assuming a non-existent graduality, but how can we ever use a line chart if we can&#8217;t accept this trade-off?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonah Feld</title>
		<link>http://www.excelcharts.com/blog/irregular-time-series-oversampling/#comment-1176</link>
		<dc:creator>Jonah Feld</dc:creator>
		<pubDate>Mon, 09 Mar 2009 18:10:54 +0000</pubDate>
		<guid isPermaLink="false">http://www.excelcharts.com/blog/?p=697#comment-1176</guid>
		<description>Recording temperature is a very different case than what John&#039;s stamp chart shows. The stamp measurements are not observations; they are the dates at which the price of stamps changed and the time in days until the next change. Necessarily, they are not equal intervals.

John&#039;s x-axis (1st chart) does have equal intervals: 22 months. It&#039;s a strange interval, but it is equal. The data are plotted correctly - the date at which the price change occurred.

Stephen&#039;s criticism is of connecting the measurements with a line. The truth is, you don&#039;t know what happened between most measurements. Implying that the movement from one point to the next is gradual is misleading.

John recognizes this and his solution is the Step Chart (2nd chart). In the case of the stamp prices, you &lt;i&gt;do&lt;/i&gt; know what happened to price between the dates: nothing. I think the step chart is completely appropriate.

Very misleading are time series charts where the x-axis is categorical but is implied as being continuous, like the second chart in Stephen&#039;s newsletter article (x-axis of 1997, 1998, 2000). But this is not what John did.</description>
		<content:encoded><![CDATA[<p>Recording temperature is a very different case than what John&#8217;s stamp chart shows. The stamp measurements are not observations; they are the dates at which the price of stamps changed and the time in days until the next change. Necessarily, they are not equal intervals.</p>
<p>John&#8217;s x-axis (1st chart) does have equal intervals: 22 months. It&#8217;s a strange interval, but it is equal. The data are plotted correctly &#8211; the date at which the price change occurred.</p>
<p>Stephen&#8217;s criticism is of connecting the measurements with a line. The truth is, you don&#8217;t know what happened between most measurements. Implying that the movement from one point to the next is gradual is misleading.</p>
<p>John recognizes this and his solution is the Step Chart (2nd chart). In the case of the stamp prices, you <i>do</i> know what happened to price between the dates: nothing. I think the step chart is completely appropriate.</p>
<p>Very misleading are time series charts where the x-axis is categorical but is implied as being continuous, like the second chart in Stephen&#8217;s newsletter article (x-axis of 1997, 1998, 2000). But this is not what John did.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jayson</title>
		<link>http://www.excelcharts.com/blog/irregular-time-series-oversampling/#comment-1175</link>
		<dc:creator>Jayson</dc:creator>
		<pubDate>Sat, 07 Mar 2009 16:34:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.excelcharts.com/blog/?p=697#comment-1175</guid>
		<description>Jorge-

Not related to this post, but I&#039;m pretty sure you could find some use for this Dilbert cartoon: http://dilbert.com/strips/comic/2009-03-07/</description>
		<content:encoded><![CDATA[<p>Jorge-</p>
<p>Not related to this post, but I&#8217;m pretty sure you could find some use for this Dilbert cartoon: <a href="http://dilbert.com/strips/comic/2009-03-07/" rel="nofollow">http://dilbert.com/strips/comic/2009-03-07/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ev</title>
		<link>http://www.excelcharts.com/blog/irregular-time-series-oversampling/#comment-1174</link>
		<dc:creator>ev</dc:creator>
		<pubDate>Tue, 24 Feb 2009 18:32:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.excelcharts.com/blog/?p=697#comment-1174</guid>
		<description>In science: if you are plotting observations/measurements (e.g. temperature), and the data plotted is exactly the data taken, then you are being completely honest about the &quot;plot&quot;. Also, typically, to be taken seriously, you have an obligation to assess and describe the sources of error in the entire measurement system.

In business: taking measurements is usually much more arbitrary, and the context and purpose are crucial though typically obscure (I work in a LARGE corporation). Attempting to investigate and assess the sources of error is often difficult and time-consuming, and may be opposed by those who own the &quot;measurement&quot; systems/data sources (among others!).

Worse, the people who ask for a report may want it in the morning but have gone home, so it can be difficult to be sure what they want to know. (I have found that a big source of error is simply the meaning of words commonly used. I once tried to make a list of all the ways the term &quot;cost&quot; was used in my area, and quickly gave up. The implications for people creating and marketing dashboards in their organizations are significant.)

In my job, a major source of time uncertainty is that, of necessity, systems produce data that is consumed by other systems that in turn feed others. Some of it may be entered by hand, and people can be inconsistent in getting that bit of their job done. So there can be many different (and inconsistent) lags involved in piecing together a report. In such circumstances, getting equal time intervals, much less an accurate picture of even simple data, is sometimes a bit of a struggle!

As for &quot;inconsistently manipulating the sizes of intervals&quot;, that is simply fraud, and it sounds like Few is addressing a somewhat different topic, yes?</description>
		<content:encoded><![CDATA[<p>In science: if you are plotting observations/measurements (e.g. temperature), and the data plotted is exactly the data taken, then you are being completely honest about the &#8220;plot&#8221;. Also, typically, to be taken seriously, you have an obligation to assess and describe the sources of error in the entire measurement system.</p>
<p>In business: taking measurements is usually much more arbitrary, and the context and purpose are crucial though typically obscure (I work in a LARGE corporation). Attempting to investigate and assess the sources of error is often difficult and time-consuming, and may be opposed by those who own the &#8220;measurement&#8221; systems/data sources (among others!).</p>
<p>Worse, the people who ask for a report may want it in the morning but have gone home, so it can be difficult to be sure what they want to know. (I have found that a big source of error is simply the meaning of words commonly used. I once tried to make a list of all the ways the term &#8220;cost&#8221; was used in my area, and quickly gave up. The implications for people creating and marketing dashboards in their organizations are significant.)</p>
<p>In my job, a major source of time uncertainty is that, of necessity, systems produce data that is consumed by other systems that in turn feed others. Some of it may be entered by hand, and people can be inconsistent in getting that bit of their job done. So there can be many different (and inconsistent) lags involved in piecing together a report. In such circumstances, getting equal time intervals, much less an accurate picture of even simple data, is sometimes a bit of a struggle!</p>
<p>As for &#8220;inconsistently manipulating the sizes of intervals&#8221;, that is simply fraud, and it sounds like Few is addressing a somewhat different topic, yes?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: admin</title>
		<link>http://www.excelcharts.com/blog/irregular-time-series-oversampling/#comment-1173</link>
		<dc:creator>admin</dc:creator>
		<pubDate>Mon, 23 Feb 2009 18:19:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.excelcharts.com/blog/?p=697#comment-1173</guid>
		<description>@nixnut: there are many ways to deceive with charts, and the arbitrary use of irregular intervals is one of them. But we can&#039;t stop using a chart format just because some people use it the wrong way. &quot;Full-disclosure&quot; should always be there by design, but some redundancy may be needed in such cases.</description>
		<content:encoded><![CDATA[<p>@nixnut: there are many ways to deceive with charts, and the arbitrary use of irregular intervals is one of them. But we can&#8217;t stop using a chart format just because some people use it the wrong way. &#8220;Full-disclosure&#8221; should always be there by design, but some redundancy may be needed in such cases.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: nixnut</title>
		<link>http://www.excelcharts.com/blog/irregular-time-series-oversampling/#comment-1172</link>
		<dc:creator>nixnut</dc:creator>
		<pubDate>Mon, 23 Feb 2009 17:28:19 +0000</pubDate>
		<guid isPermaLink="false">http://www.excelcharts.com/blog/?p=697#comment-1172</guid>
		<description>&lt;blockquote cite=&quot;Stephen Few&quot;&gt;How could we trust graphical representations of time series or frequency distributions if their shapes could have been altered by inconsistently manipulating the sizes of intervals along the scale, either arbitrarily or intentionally to deceive?&lt;/blockquote&gt;
The above question is what the issue is really about: trust in the truthfulness of a chart. If you have a good reason to use irregular intervals, then use them only if you also make perfectly clear what you did in creating your chart and why. Without this kind of &#039;full disclosure&#039; your readers simply can&#039;t trust your chart. You could also consider rearranging your irregular data to fit regular intervals (for example, by interpolating where necessary and clearly marking which data points are real and which are derived).</description>
		<content:encoded><![CDATA[<blockquote cite="Stephen Few"><p>How could we trust graphical representations of time series or frequency distributions if their shapes could have been altered by inconsistently manipulating the sizes of intervals along the scale, either arbitrarily or intentionally to deceive?</p></blockquote>
<p>The above question is what the issue is really about: trust in the truthfulness of a chart. If you have a good reason to use irregular intervals, then use them only if you also make perfectly clear what you did in creating your chart and why. Without this kind of &#8216;full disclosure&#8217; your readers simply can&#8217;t trust your chart. You could also consider rearranging your irregular data to fit regular intervals (for example, by interpolating where necessary and clearly marking which data points are real and which are derived).</p>
]]></content:encoded>
	</item>
</channel>
</rss>

