<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Conviva</title>
	<atom:link href="http://www.conviva.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.conviva.com</link>
	<description>The Global Leader</description>
	<lastBuildDate>Fri, 17 May 2013 15:47:02 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Bloomberg Businessweek Innovator: Hui Zhang&#8217;s Hiccup-Free Streaming Video</title>
		<link>http://www.conviva.com/bloomberg-businessweek-innovator-hui-zhang/</link>
		<comments>http://www.conviva.com/bloomberg-businessweek-innovator-hui-zhang/#comments</comments>
		<pubDate>Thu, 07 Mar 2013 21:06:17 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[hidden in the news]]></category>

		<guid isPermaLink="false">http://www.conviva.com/?p=1858</guid>
		<description><![CDATA[By Cliff Edwards There are seven words you can’t say on television, but only one that would-be viewers hate to see fill their online video screens: buffering. Researcher NScreenMedia estimates media companies lose $2 billion a year in ad revenue because 1 in 5 consumers gives up on patchy online videos. Computer science professor Hui Zhang says he can stop that. Zhang’s San Mateo ...]]></description>
				<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-1854" alt="Hui Zhang: Bloomberg" src="http://www.conviva.com/wp-content/uploads/hui-2.jpg" width="490" height="350" />By <a href="http://www.businessweek.com/authors/647-cliff-edwards" rel="author">Cliff Edwards</a></p>
<p>There are seven words you can’t say on television, but only one that would-be viewers hate to see fill their online video screens: buffering. Researcher NScreenMedia estimates media companies lose $2 billion a year in ad revenue because 1 in 5 consumers gives up on patchy online videos. Computer science professor Hui Zhang says he can stop that.</p>
<p>Zhang’s San Mateo (Calif.) startup, Conviva, makes stream-monitoring software that big companies can use to ease network congestion. Able to oversee millions of connections at a time—as it did during its first big test, the 2008 Olympics—the software determines whether incoming video is lagging what’s on the screen. If it’s slower, Conviva compensates by lowering the resolution or switching the viewer to a different server. While companies monitoring network capacity typically keep tabs only on their own server loads, “If you have a real-time sense of what’s happening with each video player, there are ways to infer what is happening inside the network and decide what optimization steps to take,” says Zhang, 45, who teaches at Carnegie Mellon University in Pittsburgh.</p>
<p>Born in Harbin in northeastern China, Zhang enrolled in a science and technology university at 16. He studied computer science on his high school math teacher’s advice. In 1989, a year after starting doctoral studies at the University of California at Berkeley, Zhang joined a group researching how to improve online video quality. “It was clear the killer app was going to be video,” he says, and U.S. technology was in the lead. His research on peer-to-peer streaming, including a test that brought clear video to thousands of users at once, planted the seeds for Conviva, started in 2006 as Rinera Networks. Staffers pulled some all-nighters to fix problems with software Conviva sold to its first clients, Yahoo! (<a href="http://investing.businessweek.com/research/stocks/snapshot/snapshot.asp?ticker=YHOO" data-symbol="YHOO">YHOO</a>) and Italy’s KataWeb, but the company hasn’t had such issues since, Zhang says.</p>
<p>Media companies including HBO (<a href="http://investing.businessweek.com/research/stocks/snapshot/snapshot.asp?ticker=TWX" data-symbol="TWX">TWX</a>), NBCUniversal (<a href="http://investing.businessweek.com/research/stocks/snapshot/snapshot.asp?ticker=CMCSA" data-symbol="CMCSA">CMCSA</a>), and Yahoo! have been using Zhang’s software in their video streaming applications for years, some since 2007. His company got its first big break when Zhang convinced former NBCUniversal Chief Technology Officer Darren Feher that he’d need Conviva’s live monitoring systems to manage streams of the Beijing Olympics, which totaled about 100 million over two weeks. In 2009, Feher signed on as Conviva’s chief executive officer. The company—which has raised $59 million in venture capital since 2006, including a recent $15 million led by Time Warner Investments—has 75 employees and revenue in the “double-digit millions,” Feher says, declining to elaborate.</p>
<p>He adds that Conviva, which now oversees 3 billion video streams monthly, negotiates fees differently with each client. Colin Dixon, chief analyst for NScreenMedia, says Conviva’s strength lies in adding viewer analytics like those of Adobe (<a href="http://investing.businessweek.com/research/stocks/snapshot/snapshot.asp?ticker=ADBE" data-symbol="ADBE">ADBE</a>) or Ooyala to content delivery akin to Akamai (<a href="http://investing.businessweek.com/research/stocks/snapshot/snapshot.asp?ticker=AKAM" data-symbol="AKAM">AKAM</a>). It has no direct competitors, he says. “It’s really tough to process and deliver video at the same time.”</p>
<h2>Problem</h2>
<p>Twenty percent of streaming video viewers abandon slow feeds</p>
<h2>Solution</h2>
<p>Software shuffles servers and tweaks resolution</p>
<h2>Who’s buying</h2>
<p>Clients include HBO, NBCUniversal, and Yahoo!</p>
<p><a href="http://www.businessweek.com/articles/2013-03-07/innovator-hui-zhangs-hiccup-free-streaming-video" class="button_link hover_fade alignleft" style="background-color:#000000;border-color:#000000;color:#ffffff;"><span>Read the story on Businessweek</span></a></p>
<div class="divider"></div>
<p><img class="size-full wp-image-1795 alignnone" alt="businessweek-icon" src="http://www.conviva.com/wp-content/uploads/businessweek-icon.jpg" width="243" height="92" /></p>
<p><img class="size-full wp-image-1849 alignleft" alt="Hui Zhang Bloomberg Businessweek" src="http://www.conviva.com/wp-content/uploads/homepage-image2.jpg" width="196" height="239" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.conviva.com/bloomberg-businessweek-innovator-hui-zhang/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using Spark and Hive to process BigData at Conviva</title>
		<link>http://www.conviva.com/using-spark-and-hive-to-process-bigdata-at-conviva/</link>
		<comments>http://www.conviva.com/using-spark-and-hive-to-process-bigdata-at-conviva/#comments</comments>
		<pubDate>Tue, 27 Dec 2011 16:56:19 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Engineering Blog]]></category>

		<guid isPermaLink="false">http://radi8.conviva.com/?p=994</guid>
		<description><![CDATA[Dilip Joseph -  Conviva monitors and optimizes tens of millions of online video streams daily for premium video brands. Through Conviva Pulse, our online video dashboard, customers analyze how their online video is being consumed. For example, customers can in real-time identify the most popular videos being watched and adjust their advertising strategy. As another example, customer ops teams can ...]]></description>
				<content:encoded><![CDATA[<p><em><strong>Dilip Joseph - </strong></em></p>
<p>Conviva monitors and optimizes tens of millions of online video streams daily for premium video brands. Through Conviva Pulse, our online video dashboard, customers analyze how their online video is being consumed. For example, customers can identify, in real time, the most popular videos being watched and adjust their advertising strategy. As another example, customer ops teams can detect, in real time, problems degrading the experience of users watching a live basketball game (say, high buffering due to an overloaded CDN) and quickly take corrective action before the game ends. In addition to live monitoring, our customers can also analyze historical video trends &#8211; what were the most popular videos last week and how long were they watched on average?</p>
<p>Our customers also ask questions that require very deep and often ad-hoc analysis &#8211; <em>&#8220;Something seems to be wrong with my video delivery last week. Any idea what is going on?&#8221;  </em>The video analysis team at Conviva digs through terabytes of data to provide detailed responses to such questions. This post describes how we use Hive and Spark to make this happen.</p>
<p>First, here is an overview of how data flows through the Conviva infrastructure. Conviva monitoring and optimization code embedded within our customers&#8217; video players collects non-personally identifiable information like buffering time and frame rate, and sends it to Conviva&#8217;s backend infrastructure every few seconds. The data entering our backend is replicated into two streams. One stream goes into our custom live data processing stack written in Java. This live system solves the challenging task of analyzing millions of concurrent video streams with a 1-second latency, and deserves its own separate blog post. The other stream is written into our Hadoop Distributed File System (HDFS) cluster. MapReduce jobs summarize this raw data and extract the historical trending information shown in Pulse. The raw data written to HDFS, together with the summaries calculated by the MapReduce jobs, constitutes the input to our Hive- and Spark-based offline reporting and ad-hoc analysis infrastructure, which is described next.</p>
<p><img class="alignnone size-full wp-image-1060" title="ConvivaDataProcessingArchitecture-thumb" src="http://radi8.conviva.com/wp-content/uploads/ConvivaDataProcessingArchitecture-thumb.png" alt="" width="740" height="370" /></p>
<p>&nbsp;</p>
<p>We first started with Hive. Hive provides a SQL-like interface on top of data stored in HDFS. For example, we can find the number of sessions per video on a particular day with the following Hive query:</p>
<hr />
<pre>SELECT videoName, COUNT(1)
FROM summaries
WHERE date='2011_12_12' AND customer='XYZ'
GROUP BY videoName;</pre>
<hr />
<p>Our analysis team uses Hive queries to dig deep into the data and answer customer questions. This process often involves tens of ad-hoc queries, most of them filling multiple A4 sheets of paper. The most commonly used queries are packaged into reports on our internal analysis web portal so that engineers and customer service representatives can easily run them. Our analysis team also uses Hive queries to measure and improve the performance of our video optimization algorithms, and to gain valuable insights into the online video ecosystem and user behavior. We have published some of our findings in the SIGCOMM 2011 research paper <a href="http://conferences.sigcomm.org/sigcomm/2011/papers/sigcomm/p362.pdf" target="_blank">&#8220;Understanding the Impact of Video Quality on User Engagement&#8221;</a>.</p>
<p>Hive works great for performing ad-hoc queries against large amounts of data stored in HDFS. The only problem is that queries often take a long time, as they must read data from HDFS. Most of the time, we run multiple queries against the same data set, and Hive re-reads the data from HDFS each time. This is wasteful and slow. This is where Spark helps.</p>
<p><a href="http://www.spark-project.org/" target="_blank">Spark</a> is an in-memory distributed computing framework being developed at UC Berkeley. Spark allows us to load the data of interest from HDFS (or any persistent storage) into RAM across multiple servers and cache it. We can then perform multiple queries against the cached data. Since the data is in RAM, queries are super quick. If a node dies, Spark automatically reconstructs the data from persistent storage.</p>
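<p>The load-once, query-many-times behavior described above can be illustrated with a toy, single-machine sketch in plain Scala. This is not Spark's API or implementation: <em>CachedDataset</em>, <em>load</em>, <em>loads</em> and <em>dropCache</em> are hypothetical names, and the <em>load</em> function merely stands in for a slow read from HDFS.</p>

```scala
// Toy single-machine model of "load once, query many times, rebuild on loss".
// All names here are hypothetical; this is not Spark's API or implementation.
final class CachedDataset[A](load: () => Seq[A]) {
  private var cached: Option[Seq[A]] = None
  var loads: Int = 0 // how many times the slow read from storage actually ran

  // Return the in-memory copy, reading from persistent storage only if needed.
  private def data: Seq[A] = cached.getOrElse {
    loads += 1
    val fresh = load()
    cached = Some(fresh)
    fresh
  }

  // Run any query against the cached data.
  def query[B](f: Seq[A] => B): B = f(data)

  // Model losing the node that held the data; the next query reloads it.
  def dropCache(): Unit = {
    cached = None
  }
}

object CachedDatasetDemo {
  def main(args: Array[String]): Unit = {
    val ds = new CachedDataset[Int](() => Seq(3, 1, 2)) // stand-in for an HDFS read
    println(ds.query(_.sum))  // first query triggers the load
    println(ds.query(_.max))  // served from memory
    ds.dropCache()            // simulate a lost node
    println(ds.query(_.size)) // data is rebuilt from the source
    println(ds.loads)         // number of reads from "storage"
  }
}
```

<p>In this sketch the expensive read happens only when the cache is empty, which is the property that makes repeated queries cheap. Spark additionally partitions the data across servers, which this toy model does not attempt to show.</p>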
<p>Queries in Spark are not SQL-like queries as in Hive. Instead, they are written in Scala. The following Spark query produces the same results as the video count Hive query we saw earlier.</p>
<hr />
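<pre>// Read session summaries from HDFS; the key holds the summary, the value is unused.
val sessions = sparkContext.sequenceFile[SessionSummary, NullWritable](pathToSessionSummaryOnHdfs,
                   classOf[SessionSummary], classOf[NullWritable]).flatMap {
                       // Keep only the subset of fields needed by the query.
                       case (summary, _) =&gt; summary.fieldsOfInterest
               }
// Keep only the sessions for the desired day and customer, and mark them for caching.
val cachedSessions = sessions.filter(whereConditionToFilterSessionsForTheDesiredDay).cache
// Group-by-video-name count expressed as a map-reduce job.
val mapFn : SessionSummary =&gt; (String, Long) = { s =&gt; (s.videoName, 1L) }
val reduceFn : (Long, Long) =&gt; Long = { (a,b) =&gt; a+b }
val results = cachedSessions.map(mapFn).reduceByKey(reduceFn).collectAsMap</pre>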
<hr />
<p>The above Scala code first reads the session summaries from HDFS. We keep only the subset of fields that are necessary for the query in order to conserve memory. We then filter out sessions that do not match the day and customer of interest. We now have our working set, and instruct Spark to cache it. After that, we start our group-by-video-name query, which is defined as a map-reduce job. Only at this point does Spark read the data from HDFS, cache it, execute the map-reduce job and produce the results.</p>
<p>If we run just one query, Spark is not much faster than Hive. Reading the data from HDFS is the bottleneck in both cases. Most of the time, however, we have to issue multiple queries on the same data &#8211; say, we want to find the video counts per country, state and city in addition to the video name.  In that scenario, Spark is much faster than Hive. Since the data is already cached in memory, subsequent queries complete very quickly. One of our reports which produces detailed video data analysis for a geographic region took 24 hours to run using Hive. After converting to Spark, the same report runs in under 45 minutes. The tremendous savings produced by Spark have helped us run more reports than we could have with our existing Hadoop cluster. Today, over 30% of the reports we run on a daily basis use Spark.</p>
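<p>The multi-query pattern described above (one working set, several group-by counts) can be sketched in plain Scala collections. This is a minimal illustration that runs without Spark: the <em>Session</em> record, its field names, <em>SessionCounts.countBy</em> and the sample data are all hypothetical, and the in-memory <em>Seq</em> only stands in for a cached data set.</p>

```scala
// Hypothetical record type; the field names are illustrative, not Conviva's schema.
final case class Session(videoName: String, country: String, state: String, city: String)

object SessionCounts {
  // Count sessions per key: emit (key, 1) pairs, then sum the counts for each key.
  def countBy(sessions: Seq[Session])(key: Session => String): Map[String, Long] =
    sessions
      .map(s => (key(s), 1L))
      .foldLeft(Map.empty[String, Long]) { case (acc, (k, n)) =>
        acc.updated(k, acc.getOrElse(k, 0L) + n)
      }

  def main(args: Array[String]): Unit = {
    // The working set is built once and reused by every query below.
    val workingSet = Seq(
      Session("game-highlights", "US", "CA", "San Mateo"),
      Session("game-highlights", "US", "PA", "Pittsburgh"),
      Session("trailer", "IT", "RM", "Rome")
    )
    println(countBy(workingSet)(_.videoName))
    println(countBy(workingSet)(_.country))
    println(countBy(workingSet)(_.city))
  }
}
```

<p>Each additional query only changes the key function, so the cost of loading the working set is paid once regardless of how many breakdowns are requested.</p>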
<p>For simple queries, writing the query in Spark is harder than writing it in Hive. However, it is much easier to write complicated queries in Spark than in Hive. Most real-world queries tend to be quite complex; hence the benefit of Spark. We can leverage the full power of the Scala programming language, rather than relying on the limited syntax offered by SQL. For example, the Hive expression <em>IF(os=1, &#8220;Windows&#8221;, IF(os=2, &#8220;OSX&#8221;, IF(os=3, &#8220;Linux&#8221;, &#8220;Unknown&#8221;)))</em> can be replaced by a simple <em>match</em> clause in Scala. You can also use any Java/Scala library to transform the data. Moreover, an ongoing project at UC Berkeley called Shark will soon make it possible to run Hive queries on top of Spark.</p>
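<p>As a rough sketch, the nested Hive <em>IF</em> expression mentioned above might look like this as a Scala <em>match</em>. The numeric codes and labels are taken from that expression; the object and function names (<em>OsNames</em>, <em>osName</em>) are hypothetical.</p>

```scala
object OsNames {
  // Map a numeric OS code to a label; codes and labels follow the Hive IF expression.
  def osName(os: Int): String = os match {
    case 1 => "Windows"
    case 2 => "OSX"
    case 3 => "Linux"
    case _ => "Unknown" // any other code falls through to the default
  }

  def main(args: Array[String]): Unit = {
    Seq(1, 2, 3, 42).foreach(code => println(osName(code)))
  }
}
```

<p>Adding another operating system is a one-line change here, whereas the Hive version needs another level of nesting.</p>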
<p>Overall, Spark has been a great boon for data processing operations at Conviva. We are currently working on leveraging Spark in many more ways, including real-time analysis and anomaly detection. If you are interested in working on large-scale data processing systems, <a href="http://www.conviva.com/jobs">we are hiring</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.conviva.com/using-spark-and-hive-to-process-bigdata-at-conviva/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
