Why Does GTX 1050 Out Perform GTX 660? | Tom's Hardware Forum


Why does GTX 1050 out perform GTX 660?

Thread starter: ghost47
Start date: Nov 16, 2018


ghost47

Honorable

Dec 4, 2013 4 0 10,510

I looked at the spec sheets on TechPowerUp (links at the end), and the 660 has the better memory bus, bandwidth, and texture rate, plus more GFLOPS (or TFLOPS), shading units, and TMUs. The only things better on the 1050 are its pixel rate, ROPs, and clock speeds. And obviously its process size is smaller (better), hence more transistors.

What makes the 1050 so good that it can outperform not only the 660 but also the 760 (2 GB), despite having lower specs on paper? Is it that only clock speeds, pixel rate, and transistor count matter? How should I judge a graphics card on paper?

PS: By "outperform" I mean in terms of fps.

GTX 1050 specs: https://www.techpowerup.com/gpu-specs/evga-gtx-1050-sc-acx-2-0.b3900
GTX 660 specs: https://www.techpowerup.com/gpu-specs/evga-gtx-660.b1428
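For anyone wondering where those paper numbers come from, here is a rough Python sketch of the usual spec-sheet arithmetic: peak rates are just unit counts multiplied by clock speed. The unit counts, base clocks, and memory speeds below are reference-card figures entered from memory as assumptions (the linked EVGA models are clocked differently), and the `GpuSpec` class and its method names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class GpuSpec:
    """Paper specs for one card. All values passed in below are assumed
    reference-card figures, not numbers read from the linked pages."""
    name: str
    shaders: int        # CUDA cores / shading units
    tmus: int           # texture mapping units
    rops: int           # render output units
    clock_mhz: float    # core clock used for the peak-rate math
    bus_bits: int       # memory bus width
    mem_mtps: float     # effective memory transfer rate, MT/s

    def peak_gflops(self) -> float:
        # Marketing GFLOPS count one fused multiply-add as 2 operations.
        return 2 * self.shaders * self.clock_mhz / 1000

    def texture_rate(self) -> float:
        # GTexel/s: one texel per TMU per clock.
        return self.tmus * self.clock_mhz / 1000

    def pixel_rate(self) -> float:
        # GPixel/s: one pixel per ROP per clock.
        return self.rops * self.clock_mhz / 1000

    def bandwidth_gbps(self) -> float:
        # GB/s: bytes per transfer times transfers per second.
        return self.bus_bits / 8 * self.mem_mtps / 1000


# Assumed reference figures (base clocks); real cards boost higher.
gtx660 = GpuSpec("GTX 660", 960, 80, 24, 980.0, 192, 6008.0)
gtx1050 = GpuSpec("GTX 1050", 640, 40, 32, 1354.0, 128, 7008.0)

for gpu in (gtx660, gtx1050):
    print(gpu.name,
          f"{gpu.peak_gflops():.0f} GFLOPS,",
          f"{gpu.texture_rate():.1f} GTexel/s,",
          f"{gpu.pixel_rate():.1f} GPixel/s,",
          f"{gpu.bandwidth_gbps():.1f} GB/s")
```

With these assumed inputs the 660 leads on GFLOPS, texture rate, and bandwidth, while the 1050 leads on pixel rate. All of these are theoretical peaks, and none of them says how much of the peak a real game can actually use.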

Tugrul_512bit

Distinguished

Nov 19, 2013 43 6 18,545

Solution

The GTX 660 is based on the Kepler architecture. Kepler has 192 cores per SMX, but it cannot use more than 128 of them efficiently unless the CUDA kernel is heavily optimized. Each SMX also has only 200 GB/s of bandwidth.

The GTX 1050 is based on the Pascal architecture. Pascal has 128 cores per SMX, with a better SMX design, so all of the cores are utilized much better by the same CUDA code. It also has much higher shared memory bandwidth per SMX. Since both GPUs have 5 SMX units and Pascal's are better, the GTX 1050 is the better GPU.

Did you know that early Quake games (Quake III Arena is the famous example) computed "inverse-square-root" in software on the CPU? Inverse square root and plain square root are important for gaming and rendering, so the more throughput you have for them, the better. Let's compare the GTX 660 and the GTX 1050:

- GTX 660: 32 SFUs (special function units, which can do inverse square root) per SMX.
- GTX 1050: 32 SFUs per SMX as well, but with lower latency AND a higher frequency, so maybe equivalent to 100 or so.

So, for any optimized code (gaming or not), the GTX 1050's fast on-chip memory, CUDA cores, and graphics pipelines are far superior to the GTX 660's. For unoptimized code, GDDR memory bandwidth becomes really important, so the GTX 660 can have some advantage there. But I think the GTX 1050's bigger cache narrows this gap, so the GTX 660 loses again.

GTX 1050:

- much more inverse-square-root throughput per CUDA core
- higher fast-memory bandwidth per CUDA core
- better utilization of all CUDA cores, instead of just 2/3 of them as on the GTX 660
- about 2x the GPU frequency, which doubles everything above, so each SMX unit is like the equivalent of 256 CUDA cores, with 2x the square roots and more bandwidth
- software optimizations for the newest architecture (everyone knows this one, but it is not that important)
- of course, the pixel rate lets you produce more FPS when the other parts can keep up with the pace
- If you compare processor size, do it with nanometer scaling: 14 nm for Pascal versus 28 nm for Kepler, so it looks like 4x more transistors can fit in the same area. Of course it is not as simple as that; other factors keep you from fitting a full 4x, but it is still enough to surpass the GTX 660.
- pixel compression, so memory bandwidth and bus width matter less now

Please compare these:

- special function units (SFUs) per CUDA core: 1/4 for Pascal, 1/6 for Kepler, 1/8 for older architectures
- bandwidth per CUDA core: much higher in Pascal (33% more just from the number of cores per SMX, 100% more from frequency, plus whatever the architecture adds)
- texturing performance: 40 TMUs at 2 GHz with a much better architecture vs. 80 TMUs at 1 GHz
- GFLOPS: 1800 GFLOPS vs. 1900 GFLOPS, not much different, except that the GTX 1050 can utilize at least 10-20% more of its own peak value than the GTX 660 does, so the achieved figure is 1500 GFLOPS vs. 1000 GFLOPS
- technologies: Kepler can't overlap multiple workloads efficiently. Pascal has Hyper-Q technology to multitask efficiently on each SMX unit.
- technologies: pixel compression on the GTX 1050
- technologies: dynamic parallelism on the GTX 1050
- driver updates per month
- best way: benchmarking, because benchmarking is the best apples vs. oranges comparison: https://gpu.userbenchmark.com/Compare/Nvidia-GTX-660-vs-Nvidia-GTX-1050/2162vs3650

More compute-related benchmarks:

- https://compubench.com/compare.jsp?benchmark=compu20d&did1=40772359&os1=Windows&api1=cu&hwtype1=dGPU&hwname1=NVIDIA+GeForce+GTX+1050&did2=4676&os2=Windows&api2=cu&hwtype2=dGPU&hwname2=NVIDIA+GeForce+GTX+660
- https://compubench.com/compare.jsp?benchmark=compu15d&did1=40772359&os1=Windows&api1=cl&hwtype1=dGPU&hwname1=NVIDIA+GeForce+GTX+1050&did2=4676&os2=Windows&api2=cl&hwtype2=dGPU&hwname2=NVIDIA+GeForce+GTX+660

On pure compute power, the GTX 660 is of course better, but that is a very optimized scenario. The two links above show that on average the GTX 1050 destroys the GTX 660, although in a very limited number of cases the GTX 660 can still hold its own.

If you are wondering about AMD's HD 7870 (1280 cores):

https://compubench.com/compare.jsp?benchmark=compu15d&did1=40772359&os1=Windows&api1=cl&hwtype1=dGPU&hwname1=NVIDIA+GeForce+GTX+1050&did2=22324593&os2=Windows&api2=cl&hwtype2=dGPU&hwname2=AMD+Radeon+HD+7870+GHz+Edition

It is like a cache-nerfed GTX 1050: it has 64 cores per multiprocessor unit, but it is not as efficient as the GTX 1050 in terms of core-to-core bandwidth. That's why it loses badly to the GTX 1050 in the particle benchmark but wins at T-Rex rendering. Similarly, the original GTX Titan is slower than a GTX 1060.

So, if you are comparing against a Pascal GPU:

- reduce Kepler (desktop) cores by 33%
- reduce GCN (7000 series) bandwidth by 50%

and you should have a ballpark approximation for average optimized application code. But with fully optimized code, probably only in AAA+++ games and professional applications, the old cards can still put up a fight.
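The closing rule of thumb can be written out as a small Python sketch. The derating factors (a third off Kepler's cores, half off GCN's bandwidth) and the rounded clocks (about 1 GHz and about 2 GHz) are the post's own ballpark figures, not measurements, and the function names are hypothetical helpers introduced only for this illustration.

```python
def pascal_equivalent_cores(kepler_cores):
    """Post's rule of thumb: discount desktop Kepler cores by 33%."""
    return kepler_cores * 2 / 3


def pascal_equivalent_bandwidth(gcn_bandwidth_gbps):
    """Post's rule of thumb: discount GCN (7000 series) bandwidth by 50%."""
    return gcn_bandwidth_gbps * 0.5


def relative_throughput(cores, clock_ghz):
    """Crude throughput proxy: usable cores times clock speed."""
    return cores * clock_ghz


# Core counts from the post: 5 SMX units with 192 (Kepler) or 128 (Pascal) cores.
gtx660_cores = 5 * 192     # 960 cores on paper
gtx1050_cores = 5 * 128    # 640 cores on paper

# After derating, the GTX 660 has the same number of usable cores as the GTX 1050.
gtx660_equiv = pascal_equivalent_cores(gtx660_cores)

# Rounded clocks as quoted in the post (assumption: real boost clocks vary by card).
ratio = (relative_throughput(gtx1050_cores, 2.0)
         / relative_throughput(gtx660_equiv, 1.0))

print(f"GTX 660 Pascal-equivalent cores: {gtx660_equiv:.0f}")
print(f"Estimated GTX 1050 advantage: {ratio:.1f}x")
```

Under these assumptions the two cards come out even on usable cores, so the estimated advantage is entirely the clock ratio. With a smaller real-world clock gap the estimate shrinks accordingly, which is one more reason to treat benchmarks as the final word.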

Tugrul_512bit

Distinguished

Nov 19, 2013 43 6 18,545

For example, I have 2x Quadro K420 (Kepler), and I augmented the "nbody" benchmark sample from Nvidia's CUDA toolkit with a number of CUDA optimization techniques to reach 65 percent of the two cards' combined peak GFLOPS: https://www.youtube.com/watch?v=aA6T-HPJeEE

I guess a Pascal GPU would reach (or even surpass?) 65 percent out of the box, without any optimization of the CUDA code. If someone with a GT 1030 or a GTX 1050 could try the nbody sample with the -benchmark -numbodies=65536 parameters, I would appreciate the feedback.

I don't expect anyone to surpass 72%, because the nbody algorithm isn't a multiply-add-only algorithm. There are standalone adds, standalone multiplies, and inverse square roots too! The marketed GFLOPS values only apply to fused multiply-add instructions.
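To make the instruction-mix point concrete, here is a minimal Python sketch of a single body-body gravitational interaction, the inner step of an n-body simulation. This is not the CUDA toolkit sample's code; the function name, the `softening_sq` parameter, and the comments about which operations a GPU could fuse are illustrative assumptions.

```python
import math


def body_body_acceleration(pos_i, pos_j, mass_j, softening_sq=0.0):
    """Acceleration on body i caused by body j (gravitational constant omitted).

    softening_sq is a small positive value normally added so that two bodies
    at the same position do not cause a division by zero.
    """
    # Three standalone subtractions: nothing to fuse them with.
    dx = pos_j[0] - pos_i[0]
    dy = pos_j[1] - pos_i[1]
    dz = pos_j[2] - pos_i[2]

    # Squared distance: one multiply, then multiply-adds a GPU could fuse.
    dist_sq = dx * dx + dy * dy + dz * dz + softening_sq

    # One inverse square root (special-function-unit work on a GPU),
    # then standalone multiplies to build 1 / distance^3.
    inv_dist = 1.0 / math.sqrt(dist_sq)
    inv_dist_cubed = inv_dist * inv_dist * inv_dist

    # Standalone multiply for the scale factor. In a real loop the three
    # products below would be accumulated into a running sum, which makes
    # each of them a multiply-add.
    scale = mass_j * inv_dist_cubed
    return (dx * scale, dy * scale, dz * scale)


# Example: a unit mass at distance 5 along a 3-4-5 triangle.
print(body_body_acceleration((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), 1.0))
```

Only part of this work maps onto fused multiply-adds, so a kernel built around it cannot reach the marketed peak even when it is perfectly optimized.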
