Slashdot site architecture: hardware and software (repost)

2024/4/27 18:56:52 / Article source: https://blog.csdn.net/ak47mig/article/details/1854404
http://slash.solidot.org/article.pl?sid=07/10/27/1244202&from=rss
 
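The Solidot site regularly suffers from minor glitches; most recently, the comment counter has been lagging badly. Slashdot runs on the same Slashcode, so how it operates is a useful reference for us. Its Alexa rank is around 800 (digg is now around 100, and the gap keeps widening), and its daily traffic is impressive. To mark the site's 10th anniversary, Slashdot's engineers described the overall site architecture in two parts, hardware and software.
Hardware: Slashdot now belongs to SourceForge, and its basic hardware layout is the same as that of SourceForge's other sites, such as SourceForge.net, Thinkgeek.com, Freshmeat.net, and Linux.com. There is one data center with raised floors, generators, UPS, 24x7 security and so on, like any ordinary data center.
Bandwidth and network: a pair of Cisco 7301 routers, a pair of Foundry BigIron 8000 switches, and a pair of Rackable Systems 1U machines acting as load-balancing firewalls, each with a P4 Xeon 2.66GHz, 2GB RAM, and 2x80GB IDE, running CentOS and LVS.
There are 16 web servers, all running Red Hat 9. Two serve static content (scripts, images, and the front page shown to non-registered users), four serve the front page shown to registered users, and the remaining ten handle comment pages. Each is a Rackable 1U with 2 Xeon 2.66GHz processors, 2GB of RAM, 2x80GB IDE disks...
There are 7 database servers, all running CentOS 4, each configured with 2 Dual Opteron 270, 16GB RAM, and 4x36GB 15K RPM SCSI drives. One is a write-only database and the rest are read-write databases; they can swap roles dynamically at any time.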

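Software: HTTP requests first pass through the pound servers. Pound is a proxy server that chooses a web server to answer each request. Slashdot runs 6 pounds in total: one for HTTPS encrypted access (offered to subscribers) and five for standard HTTP. The web server is Apache and the database is MySQL. Slash 1.0 was completed in early 2000, and the latest version at present is 2.2.6.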
___________________________________________________
http://meta.slashdot.org/article.pl?sid=07/10/22/145209
Today we have Part 2 in our exciting 2 part series about the infrastructure that powers Slashdot. Last week Uriah told us all about the hardware powering the system. This week, Jamie McCarthy picks up the story and tells us about the software... from pound to memcached to mysql and more. Hit that link and read on.

The software side of Slashdot takes over at the point where our load balancers -- described in Friday's hardware story -- hand off your incoming HTTP request to our pound servers.

Pound is a reverse proxy, which means it doesn't service the request itself, it just chooses which web server to hand it off to. We run 6 pounds, one for HTTPS traffic and the other 5 for regular HTTP. (Didn't know we support HTTPS, did ya? It's one of the perks for subscribers: you get to read Slashdot on the same webhead that admins use, which is always going to be responsive even during a crush of traffic -- because if it isn't, Rob's going to breathe down our necks!)

The pounds send traffic to one of the 16 apaches on our 16 webheads -- 15 regular, and the 1 HTTPS. Now, pound itself is so undemanding that we run it side-by-side with the apaches. The HTTPS pound handles SSL itself, handing off a plaintext HTTP request to its machine's apache, so the apache it redirects traffic to doesn't need mod_ssl compiled in. One less headache! Of our other 15 webheads, 5 also run a pound, not to distribute load but just for redundancy.

(Trivia: pound normally adds an X-Forwarded-For header, which Slash::Apache substitutes for the (internal) IP of pound itself. But sometimes if you use a proxy on the internet to do something bad, it will send us an X-Forwarded-For header too, which we use to try to track abuse. So we patched pound to insert a special X-Forward-Pound header, so it doesn't overwrite what may come from an abuser's proxy.)
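
To make the header trick concrete, here is a minimal Python sketch of the idea. It is not the actual pound patch or the Slash::Apache code (Slash is Perl); the function names are hypothetical, and header lookup is simplified to exact-case dictionary keys. The proxy records the connecting IP in its own header and leaves any client-supplied X-Forwarded-For untouched, so the application can trust the former and keep the latter only as an unverified hint for abuse tracking.

```python
def proxy_add_headers(incoming_headers, peer_ip):
    """Hypothetical stand-in for the patched proxy: store the connecting IP
    in a dedicated header and pass any X-Forwarded-For through unchanged."""
    out = dict(incoming_headers)
    out["X-Forward-Pound"] = peer_ip
    return out


def resolve_client(headers, socket_ip):
    """Hypothetical stand-in for the application side.

    Returns (client_ip, claimed_chain): client_ip comes from the proxy's own
    header (falling back to the socket address), while claimed_chain is
    whatever an upstream proxy claimed, which is unverified.
    """
    client_ip = headers.get("X-Forward-Pound", socket_ip)
    raw = headers.get("X-Forwarded-For", "")
    claimed_chain = [part.strip() for part in raw.split(",") if part.strip()]
    return client_ip, claimed_chain


if __name__ == "__main__":
    # A request arriving through an internet proxy that set X-Forwarded-For.
    headers = proxy_add_headers({"X-Forwarded-For": "203.0.113.7"}, "198.51.100.20")
    print(resolve_client(headers, "10.0.0.5"))
```

The design point is simply that the two pieces of information live in separate headers, so neither overwrites the other.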

The other 15 webheads are segregated by type. This segregation is mostly what pound is for. We have 2 webheads for static (.shtml) requests, 4 for the dynamic homepage, 6 for dynamic comment-delivery pages (comments, article, pollBooth.pl), and 3 for all other dynamic scripts (ajax, tags, bookmarks, firehose). We segregate partly so that if there's a performance problem or a DDoS on a specific page, the rest of the site will remain functional. We're constantly changing the code and this sets up "performance firewalls" for when us silly coders decide to write infinite loops.
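
As a rough illustration of this kind of segregation, the sketch below classifies a request by its path and hands it to a pool of the matching size. This is a Python sketch, not pound configuration; the host names, the `index.pl` homepage script name, the choice to treat `/` as a homepage request, and the round-robin policy within a pool are all assumptions made for the example.

```python
import itertools
from urllib.parse import urlsplit

# Pool sizes follow the text (2 + 4 + 6 + 3 = 15); host names are made up.
POOLS = {
    "static": [f"static{i}" for i in range(1, 3)],
    "homepage": [f"home{i}" for i in range(1, 5)],
    "comments": [f"comments{i}" for i in range(1, 7)],
    "other": [f"other{i}" for i in range(1, 4)],
}

COMMENT_SCRIPTS = {"comments.pl", "article.pl", "pollBooth.pl"}


def classify(url):
    """Map a request URL to the name of the pool that should serve it."""
    name = urlsplit(url).path.rsplit("/", 1)[-1]
    if name.endswith(".shtml"):
        return "static"
    if name in ("", "index.pl"):  # assumption: "/" counts as the dynamic homepage
        return "homepage"
    if name in COMMENT_SCRIPTS:
        return "comments"
    return "other"


# One round-robin cursor per pool (the balancing policy is an assumption).
_cursors = {pool: itertools.cycle(hosts) for pool, hosts in POOLS.items()}


def pick_backend(url):
    """Return (pool_name, host) for a request."""
    pool = classify(url)
    return pool, next(_cursors[pool])


if __name__ == "__main__":
    for url in ("/index.shtml", "/", "/article.pl?sid=07/10/22/145209", "/firehose.pl"):
        print(url, "->", pick_backend(url))
```

Because each pool has its own hosts, a runaway script in one class can only exhaust that class's webheads.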

But we also segregate for efficiency reasons like httpd-level caching, and MaxClients tuning. Our webhead bottleneck is CPU, not RAM. We run MaxClients that might seem absurdly low (5-15 for dynamic webheads, 25 for static) but our philosophy is if we're not turning over requests quickly anyway, something's wrong, and stacking up more requests won't help the CPU chew through them any faster.
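
The reasoning behind a low MaxClients can be checked with back-of-the-envelope arithmetic. The Python sketch below assumes a deliberately simplified model that is not from the article: every request is purely CPU-bound, needs a fixed amount of CPU time, and all MaxClients slots stay full. Under those assumptions throughput is capped by the number of cores, and by Little's law the time each request spends in the server grows in proportion to the number of slots.

```python
def cpu_bound_estimate(cores, cpu_seconds_per_request, max_clients):
    """Estimate saturated throughput and per-request latency.

    Simplified model (an assumption for illustration): requests are purely
    CPU-bound and every MaxClients slot is always occupied.
    """
    # At most `cores` requests make progress at once, however many are queued.
    busy_cores = min(cores, max_clients)
    throughput = busy_cores / cpu_seconds_per_request  # requests per second
    # Little's law: requests in system = throughput * time in system.
    latency = max_clients / throughput  # seconds per request
    return throughput, latency


if __name__ == "__main__":
    # Hypothetical numbers: 2 cores, 0.25 CPU-seconds per dynamic page.
    for max_clients in (8, 80):
        throughput, latency = cpu_bound_estimate(2, 0.25, max_clients)
        print(f"MaxClients={max_clients}: {throughput} req/s, {latency} s per request")
```

In this model, raising MaxClients tenfold leaves throughput unchanged and only makes every request ten times slower.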

All the webheads run the same software, which they mount from a /usr/local exported by a read-only NFS machine. Everyone I've ever met outside of this company gives an involuntary shudder when NFS is mentioned, and yet we haven't had any problems since shortly after it was set up (2002-ish). I attribute this to a combination of our brilliant sysadmins and the fact that we only export read-only. The backend task that writes to /usr/local (to update index.shtml every minute, for example) runs on the NFS server itself.

The apaches are versions 1.3, because there's never been a reason for us to switch to 2.0. We compile in mod_perl, and lingerd to free up RAM during delivery, but the only other nonstandard module we use is mod_auth_useragent to keep unfriendly bots away. Slash does make extensive use of each phase of the request loop (largely so we can send our 403's to out-of-control bots using a minimum of resources, and so your page is fully on its way while we write to the logging DB).

Slash, of course, is the open-source perl code that runs Slashdot. If you're thinking of playing around with it, grab a recent copy from CVS: it's been years since we got around to a tarball release. The various scripts that handle web requests access the database through Slash's SQL API, implemented on top of DBD::mysql (now maintained, incidentally, by one of the original Slash 1.0 coders) and of course DBI.pm. The most interesting parts of this layer might be:

(a) We don't use Apache::DBI. We use connect_cached, but actually our main connection cache is the global objects that hold the connections. Some small chunks of data are so frequently used that we keep them around in those objects.

(b) We almost never use statement handles. We have eleven ways of doing a SELECT and the differences are mostly how we massage the results into the perl data structure they return.

(c) We don't use placeholders. Originally because DBD::mysql didn't take advantage of them, and now because we think any speed increase in a reasonably-optimized web app should be a trivial payoff for non-self-documenting argument order. Discuss!

(d) We built in replication support. A database object requested as a reader picks a random slave to read from for the duration of your HTTP request (or the backend task). We can weight them manually, and we have a task that reweights them automatically. (If we do something stupid and wedge a slave's replication thread, every Slash process, across 17 machines, starts throttling back its connections to that machine within 10 seconds. This was originally written to handle slave DBs getting bogged down by load, but with our new faster DBs, that just never happens, so if a slave falls behind, one of us probably typed something dumb at the mysql> prompt.)

(e) We bolted on memcached support. Why bolted-on? Because back when we first tried memcached, we got a huge performance boost by caching our three big data types (users, stories, comment text) and we're pretty sure additional caching would provide minimal benefit at this point. Memcached's main use is to get and set data objects, and Slash doesn't really bottleneck that way.
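
The replica handling in (d) can be sketched roughly as follows. This is an illustrative Python sketch, not Slash's Perl implementation: the class and method names, the lag-based reweighting rule, and the 30-second threshold are all assumptions. It shows weighted random selection of a reader, plus a reweighting step that stops sending traffic to a replica whose replication has stalled.

```python
import random


class ReaderPool:
    """Weighted random choice among read replicas (hypothetical helper)."""

    def __init__(self, weights, rng=None):
        self.weights = dict(weights)  # host -> non-negative weight
        self.rng = rng or random.Random()

    def pick(self):
        """Choose one reader; call once per request and reuse the result."""
        hosts = [h for h, w in self.weights.items() if w > 0]
        if not hosts:
            raise RuntimeError("no reader available")
        chosen = self.rng.choices(hosts, weights=[self.weights[h] for h in hosts], k=1)
        return chosen[0]

    def reweight(self, lag_seconds, max_lag=30.0):
        """Recompute weights from replication lag (assumed policy).

        A lag of None means the replication thread is stopped; such hosts,
        and hosts at or beyond max_lag, get weight 0. Hosts missing from
        lag_seconds keep their current (possibly manual) weight.
        """
        for host, lag in lag_seconds.items():
            if lag is None or lag >= max_lag:
                self.weights[host] = 0.0
            else:
                self.weights[host] = 1.0 - lag / max_lag


if __name__ == "__main__":
    pool = ReaderPool({"db1": 1.0, "db2": 1.0, "db3": 1.0})
    pool.reweight({"db1": 0.0, "db2": None, "db3": 15.0})
    print(pool.weights, pool.pick())
```

Picking once at the start of a request and reusing that choice matches the "for the duration of your HTTP request" behavior described above; a background task would call something like `reweight` periodically.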

Slash 1.0 was written way back in early 2000 with decent support for get and set methods to abstract objects out of a database (getDescriptions, subclassed _wheresql) -- but over the years we've only used them a few times. Most data types that are candidates to be objectified either are processed in large numbers (like tags and comments), in ways that would be difficult to do efficiently by subclassing, or have complicated table structures and pre- and post-processing (like users) that would make any generic objectification code pretty complicated. So most data access is done through get and set methods written custom for each data type, or, just as often, through methods that perform one specific update or select.

Overall, we're pretty happy with the database side of things. Most tables are fairly well normalized, not fully but mostly, and we've found this improves performance in most cases. Even on a fairly large site like Slashdot, with modern hardware and a little thinking ahead, we're able to push code and schema changes live quickly. Thanks to running multiple-master replication, we can keep the site fully live even during blocking queries like ALTER TABLE. After changes go live, we can find performance problem spots and optimize (which usually means caching, caching, caching, and occasionally multi-pass log processing for things like detecting abuse and picking users out of a hat who get mod points).
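
For readers who have not seen the "caching, caching, caching" pattern written down, here is a minimal cache-aside sketch in Python. It uses a small in-process dictionary as a stand-in for memcached so the example is self-contained; the class, the function names, the key format and the 60-second TTL are assumptions for illustration, not Slash's API.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for a memcached-style get/set store."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at <= self._clock():
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)


def cached_fetch(cache, key, loader, ttl=60):
    """Cache-aside read: try the cache, fall back to the loader, then store.

    Note that a loader result of None is never cached in this sketch.
    """
    value = cache.get(key)
    if value is None:
        value = loader()
        cache.set(key, value, ttl)
    return value


if __name__ == "__main__":
    def load_user():
        # Placeholder for a database query.
        return {"uid": 42, "nickname": "example"}

    cache = TTLCache()
    print(cached_fetch(cache, "user:42", load_user))
    print(cached_fetch(cache, "user:42", load_user))  # served from the cache
```

The second call never reaches the loader, which is the whole payoff for hot objects such as users, stories and comment text.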

In fact, I'll go further than "pretty happy." Writing a database-backed web site has changed dramatically over the past seven years. The database used to be the bottleneck: centralized, hard to expand, slow. Now even a cheap DB server can run a pretty big site if you code defensively, and thanks to Moore's Law, memcached, and improvements in open-source database software, that part of the scaling issue isn't really a problem until you're practically the size of eBay. It's an exciting time to be coding web applications.
